Techniques for 4k on CPC
|Hicks - 01 July 2012 - 21:12:17||459 posts|
|I would like to open a little discussion about the 4k category on Amstrad. There are currently 7 different 4k intros on CPC. I put all these prods on a single disc, just click here to download it (useful if you are using HxC with one disc per prod).
Two years ago, there was only one 4k (CngTro), and then 6 more in two years! Just see the Pouet page of CPC 4k. That's of course a great thing!
I don't want to explain here why I think that 4k is a very interesting exercise. I'm more interested in discussing how we can raise the level with more suitable tools, personal tips, and so on. So I am especially interested in Krusty's and Optimus's points of view since they have already tried it, but of course, all points of view are welcome.
I confess that I am also opening this thread because I'm thinking about trying my second 4k, and I would like to avoid the defects and problems I met during my first try.
- I suppose that everybody is using Exomizer, which allows a very high level of compression with compact unpacking code (close to 200 bytes and maybe less in the last version). There are probably tips about data organisation to think about... Some discussion about "entropy" is also possible (avoid compressing manually).
- Music is more troublesome. For my first and only try (Moody), I tried to optimise the Arkos Tracker player. Player+music was approx. equal to #600 bytes. That's ok for a "bricolage" (do-it-yourself), but I believe a dedicated player would be more efficient. Krusty/Optimus: what did you use? I know that Grimmy was working on this kind of compact player, maybe he can talk about it here...
- System friendly! To generate mathematical things, we have to use the System. I must admit I'm not very familiar with that, and documentation is lacking (there is only Grimware). It would also be interesting to discuss which vectors are the most useful, for example...
As you can see, there are a lot of possible directions for this discussion about 4k on CPC! Feel free to add your own ideas!
|krusty - 01 July 2012 - 21:58:44||327 posts|
|Maybe I'll put a longer answer later.
- One important thing is to always check code size AFTER compression and not BEFORE compression, because compact code can compress very badly.
- It is often possible to achieve the same results by coding in different ways.
These ways compress more or less well, so it can be worthwhile to code different versions of the same routine and select the right one using conditional assembly. Using the right combination of routines can improve the crunching ratio.
- I often obtained better results by using unrolled loops generated by the assembler rather than code with loops or unrolled loops built during initialization.
- I used Grim's player for the last two 4Ks. He made a tool which converts Arkos Tracker songs into a format specific to his player. The results are far better than manually stripping unused code from AT.
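The "measure after compression" advice above can be automated on the PC side of the build. The following is only a minimal sketch: `packed_size` and `pick_smallest` are hypothetical helper names, and zlib is used purely as a stand-in for Exomizer, so the absolute numbers will differ from real exomized sizes.

```python
import zlib

def packed_size(data: bytes) -> int:
    """Size after compression. zlib level 9 is only a stand-in for Exomizer."""
    return len(zlib.compress(data, 9))

def pick_smallest(variants: dict, rest: bytes = b"") -> str:
    """Name of the variant that gives the smallest packed size.

    `rest` is the remainder of the binary: a routine should be measured
    in context, because the packer can reuse bytes it has already seen.
    """
    return min(variants, key=lambda name: packed_size(rest + variants[name]))

# Two encodings of the same job (store A at 64 consecutive addresses).
variants = {
    # LD B,64 / LD (HL),A / INC HL / DJNZ -4
    "loop": bytes([0x06, 0x40, 0x77, 0x23, 0x10, 0xFC]),
    # LD (HL),A / INC HL repeated 64 times by the assembler
    "unrolled": bytes([0x77, 0x23]) * 64,
}

for name, code in variants.items():
    print(name, len(code), "bytes raw,", packed_size(code), "bytes packed")
print("smallest after packing:", pick_smallest(variants))
```

On inputs this small the packer's fixed overhead dominates, so the comparison is only meaningful when `rest` holds the actual intro binary and the real packer replaces zlib.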
|Optimus - 02 July 2012 - 11:07:34||334 posts|
|The discussion is interesting, but I don't have much to add here since I didn't try to optimize by any special means in my first 4k. I used Exomizer at the end, and I used the Arkos player unmodified with the very small tune from Factor6, which was 900 bytes uncompressed. Grim's player sounds interesting. Is it free for download?
My real curiosity in 4k is how to precalc mathematical stuff. Before working on this 4k I had another idea that I started but never finished, because I needed to precalc square roots and angles for 64*64 pixels. I tried an integer sqrt routine, and that alone took 12 seconds. I didn't get as far as the angle calcs. I thought I would use the firmware calls for those, but that needed some research into how to convert float to int (or which calls do it for you) before feeding the atan function (also, there is no atan2, you have to do it yourself :P). I expected it to be much slower, so I lost motivation on this one. Of course I could calculate every other pixel and fill in the ones in between by averaging to reduce the time, but that is still lots of work. I am wondering: in the 4k intro stop the nyan cat there is a tunnel at the end. I know tunnels need sqrt and angle to generate their shape (even if in the end you have precalced code that maps from texture to pixels and never needs to touch the math again). But it loads fast enough iirc. What techniques did you use for this 4k?
|krusty - 02 July 2012 - 12:49:34||327 posts|
|The two tunnel arrays are pre-computed and crunch very well.
I'll try to put the source code online during the week.
Grim has not yet published his conversion tools.
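For readers unfamiliar with the two tunnel arrays mentioned above, here is a rough sketch of how such tables could be generated offline on a PC and then included as binary data. The 64x64 size comes from the discussion; the function name, the depth scaling and the 8-bit angle range are assumptions, not the method used in the actual intro.

```python
import math

SIZE = 64  # table is SIZE x SIZE, as in the 64*64 case discussed above

def tunnel_tables(size=SIZE, depth_scale=256):
    """Return (distance, angle) lookup tables as two byte strings."""
    half = size / 2
    dist, angle = bytearray(), bytearray()
    for y in range(size):
        for x in range(size):
            # Offset by 0.5 so no pixel sits exactly on the centre (r > 0).
            dx, dy = x - half + 0.5, y - half + 0.5
            r = math.sqrt(dx * dx + dy * dy)
            # Texture row: perspective depth, wrapped to 8 bits.
            dist.append(int(depth_scale / r) & 0xFF)
            # Texture column: full circle mapped onto 0..255.
            angle.append(int(math.atan2(dy, dx) / (2 * math.pi) * 256) & 0xFF)
    return bytes(dist), bytes(angle)

dist, angle = tunnel_tables()
```

The distance table is symmetric about the centre, so in principle only one quadrant has to be stored or computed and the rest can be mirrored.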
|PulkoMandy - 02 July 2012 - 18:21:55||633 posts|
|I'm not sure using the firmware for maths is a good idea, particularly if you do integers. There are Cordic series for all the trigonometry stuff, for example, and if you're already using integers, you likely don't mind extra error...
For example you can approximate the square root :
Even without the loop this may be enough for some effects, or in some cases make them look better by adding a wild distortion :)
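The snippet that originally followed "you can approximate the square root" is not present in this copy of the thread. As a stand-in only, here is a sketch of one common approximation that matches the description (a loop built on integer division): a fixed number of Newton (Heron) steps. Whether this is the method the original snippet used is an assumption, and `approx_sqrt` is a hypothetical name.

```python
def approx_sqrt(n, iterations=4):
    """Approximate sqrt(n) with a few Newton steps using integer division."""
    if n < 2:
        return n
    # Starting guess: a power of two that is at least sqrt(n).
    x = 1 << ((n.bit_length() + 1) // 2)
    for _ in range(iterations):
        x = (x + n // x) // 2
    return x

print([approx_sqrt(n) for n in (0, 1, 16, 100, 1000, 65535)])
```

With `iterations=0` the result is just the power-of-two starting guess, which is very coarse but may be the kind of "wild distortion" that still looks fine in an effect.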
For the music, it seems that Starkos and Arkos Tracker are not the best tools. Using AMC for example is likely to give smaller results. But now, you have to find a musician who wants to use it... Optimizing the player still applies.
In some cases, Exomizer applied to YM data with the registers reorganised (similar to the AYC format) might give interesting results as well. The player for YM is very small (even smaller if you use the firmware to talk to the PSG) and the repetitive register sequences should pack well with Exomizer. (I didn't try.)
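The register reorganisation can be sketched as follows: a per-frame dump (all registers for frame 0, then frame 1, and so on) is turned into one stream per register, so that each slowly changing register forms a long repetitive run for the packer. This is an illustration under assumptions: 14 registers per frame, hypothetical function names, and no claim about the exact AYC layout.

```python
REGS = 14  # AY/YM registers R0..R13, dumped once per frame (assumption)

def deinterleave(frames: bytes, regs: int = REGS) -> bytes:
    """Frame-major dump -> register-major streams, concatenated."""
    assert len(frames) % regs == 0
    return b"".join(frames[r::regs] for r in range(regs))

def interleave(streams: bytes, regs: int = REGS) -> bytes:
    """Inverse of deinterleave: rebuild the frame-major dump."""
    assert len(streams) % regs == 0
    n = len(streams) // regs  # number of frames
    return bytes(streams[r * n + f] for f in range(n) for r in range(regs))

# Tiny demonstration with 3 "registers" and 2 frames.
print(list(deinterleave(bytes([0, 1, 2, 3, 4, 5]), regs=3)))
```

On the CPC side the player then reads one byte from each of the 14 streams per frame instead of 14 consecutive bytes.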
|Grim - 02 July 2012 - 19:07:30||521 posts|
|The Firmware Math comes for free. A CORDIC (or anything else) implementation doesn't. If you have a lot of math involved in your precalc, Firmware wins. For a few very simple things (e.g. sine, square root), well-crafted custom generators can be shorter than a bunch of firmware CALLs, but usually not by much.
AMC produces small music data that doesn't crunch very well (at least not with the usual LZ-based packers). The Soundtrakker128 has a shorter music player than AMC (1175 bytes vs. 1376) and produces larger music-data which compresses relatively well. (Yet both suck in my opinion :)
Crunching raw AY/YM streams works great but only for short music. This method can be improved, see the fast AY player by TmK/Demarche on Speccy.
My player is not publicly released yet, so it doesn't exist thus there's nothing to say about it :)
|CloudStrife - 02 July 2012 - 20:19:19||161 posts|
|PulkoMandy: Well, using a bit-exact isqrt is probably a lot faster and smaller than using integer division...|
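For reference, a bit-exact integer square root can be written with shifts, additions and comparisons only, which is why it maps well onto a Z80. The sketch below shows the classic digit-by-digit method for 16-bit inputs; it illustrates the general technique and is not code from any of the intros discussed. `isqrt16` is a hypothetical name.

```python
def isqrt16(n):
    """floor(sqrt(n)) for 0 <= n < 65536, without any division."""
    root = 0
    bit = 1 << 14  # highest power of four that fits in 16 bits
    while bit:
        if n >= root + bit:
            n -= root + bit
            root = (root >> 1) + bit
        else:
            root >>= 1
        bit >>= 2
    return root

print([isqrt16(n) for n in (0, 15, 16, 1000, 65535)])
```

The loop always runs eight times for a 16-bit input, so the cost per value is fixed.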
|TotO - 02 July 2012 - 21:16:06||127 posts|
|"For the music, it seems that Starkos and Arkos Tracker are not the best tools."
They are definitely the best tools for composing music for CPC. (when you are a musician)
You could say "they are not the most appropriate for replaying music in a tiny demo", and as Grim explains, even that is not really true.
|Grim - 03 July 2012 - 14:39:07||521 posts|
|"Player+music [in Moody] was approx. equal to #600 bytes"
Just verified, that's approx. #200 bytes too much (Evil grin on the face :)
|Hicks - 07 July 2012 - 15:58:57||459 posts|
|Nice to see people interested in 4K!
- Krusty is right, we always have to check the code/data size after compression. I would like to go further: there is no real sense in speaking about "#600 uncrunched bytes" because a #600 bytes file can be reduced by 50% or can be uncrunchable (it depends on the content). So we have to speak of "size-after-compression-with-XXX". And as Exomizer seems to always obtain the best results, we can speak of "exomized size".
- So I can correct what I said: player+music in Moody is not "#600 uncrunched" but approx. "#3E0 exomized". If you (Grim) do better with your player (#C0 exomized bytes better?), that's great news for the future of 4K :)
- A little introduction to using the Math Firmware would be a very useful article for everyone, if someone wants to write it (eyes roll slowly in Grim's direction :).
- About the TmK player: if I understand correctly, it is a kind of "synthesis" between a "real player" (managing pitch, vibrato, tracks, etc.) and a "streamer" (managing a list of registers). Is it faster, or lighter, or faster & lighter than an "AY streamer"?
|Grim - 24 July 2012 - 19:18:51||521 posts|
|Using the firmware math is pretty straightforward. All the functions are already documented in the Firmware guide. There are only two tricky things to know: how real numbers are encoded (mantissa and exponent stuff) and, most important, that your math buffers (where numbers are stored) should not be located within the &0000-&3FFF address range (or strange calculations might happen).
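To make the "mantissa and exponent stuff" more concrete, here is a PC-side sketch that encodes a value into 5 bytes. It assumes the layout commonly described for CPC reals: a 4-byte mantissa stored least significant byte first, with the sign bit taking the place of the implied leading 1, followed by one exponent byte biased by 128 (zero is encoded with a zero exponent). Check this against the Firmware guide before relying on it; `cpc_real` is a hypothetical helper name.

```python
import math

def cpc_real(value: float) -> bytes:
    """Encode a Python float as a 5-byte real (layout assumed, see above)."""
    if value == 0:
        return bytes(5)                  # exponent byte 0 means zero
    m, e = math.frexp(abs(value))        # abs(value) = m * 2**e, 0.5 <= m < 1
    mant = round(m * (1 << 32))          # 32-bit mantissa, top bit always set
    if mant == 1 << 32:                  # rounding carried into the next power of two
        mant >>= 1
        e += 1
    assert 0 < e + 128 < 256, "value out of range for this format"
    mant &= 0x7FFFFFFF                   # drop the implied leading 1
    if value < 0:
        mant |= 0x80000000               # sign is stored where the leading 1 was
    return mant.to_bytes(4, "little") + bytes([e + 128])

print(cpc_real(1.0).hex(), cpc_real(math.pi).hex())
```

Constants encoded this way can be included as data and passed to the firmware by address, provided the buffer respects the address-range restriction mentioned above.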
About the Fast-AY player: it dumps, encodes & compresses the different patterns of the music separately and replays them in order according to the song-list. The aim was to get a very fast player, not to achieve an awesome compression ratio. The music player is extremely short & simple (and could be easily modified for the CPC), and the music data is large (but compresses relatively well with Exomizer and the like). However, IIRC the PC software to encode the music only supports Speccy music formats for now, so it can't really be used "as-is" on CPC (unless you don't mind all the horrible tone sync problems with Speccy-PT3 music playing on CPC). Nonetheless, the idea of encoding the patterns of a music separately should be (imo) considered by anyone