Zoom initiation (DMK #8 unreleased)
Written by Targhan on 18 April 2010
How lucky you are! You're going to learn about zooms, so that you can make a fool out of Ecole Buissonnière by this unknown Swedish coder (What is Hoeger Fruzenham?). You will also find a little Winape-compatible source here.
By zoom, I mean the shrinking and stretching of an image. This is made possible by a technique called "fixed-point arithmetic". Please note that I will only talk about X zoom to keep things simple. Y zoom works exactly the same way, but you will have to do it yourself. I also stress that this article is just an initiation. No optimisation is performed, and the code isn't particularly elegant. The following algorithm is actually not fast enough for most real-time display. However, it is perfect for generating code that you will call later. The fastest software technique is to have as many pieces of generated code as there are zoom steps and, if memory allows, you could even precalculate the code for every step of every line of the image!
What's more, the technique discussed here is not only useful for zooms, but also for playing samples at any rate, simulating the acceleration and deceleration of objects, and anything else your mind can imagine.
Let's get into the heart of the matter, shall we?
WHAT THE HELL IS A ZOOM?
Imagine an image at its original size. Now imagine that you zoom in a bit along X. What happens? Some columns are doubled here and there, some aren't. Zooming further, when all the columns are doubled, we have an image that is twice as wide as the original, but also twice as pixelated. Zooming further still, you get columns that are 3 pixels wide, then 4, 5, and so on until your image is just a heap of dirty pixels. Whoops, I shouldn't dwell on how ugly zooms can get :).
So, when displaying the image, each pixel is asked a question: are you the same as the one I've just displayed, or are you the next one in the original image? This question is the main problem. Once you know the answer, there is no problem left!
Actually, the technique I use follows a different train of thought: I take one pixel out of every X from my image and display them one after the other on the screen. If X=1 then my image is the same as the original. If X=2, the result is half as wide, as we've skipped half of the pixels. But it becomes interesting when X is not an integer. If X=0.5, the image is twice as wide (because we've displayed every pixel twice). X could be 1.45, 0.79, 0.1... The zoom rate can be quite accurate, and by making X go from 0.1 to 4, we would get a nice looking un-zoom.
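To make the idea concrete, here is a small model written in Python (Python rather than Z80, purely for illustration; the function name `sampled_columns` is mine, not from the source). It walks along one line with an exact fractional X and lists which source column ends up at each screen position.

```python
from fractions import Fraction

def sampled_columns(width, step):
    """List the source column shown at each screen position.

    width: number of columns in the original line.
    step:  the X factor (source columns advanced per displayed column), > 0.
    """
    position = Fraction(0)
    columns = []
    while int(position) < width:          # stop once we run past the source line
        columns.append(int(position))     # only the integer part selects a column
        position += Fraction(step)
    return columns

print(sampled_columns(4, 1))               # X=1: original size
print(sampled_columns(4, 2))               # X=2: half as wide
print(sampled_columns(4, Fraction(1, 2)))  # X=0.5: twice as wide
```

With X=0.5 every column comes out twice, and with X=2 every other column is skipped, which is all a zoom really is.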
So the only question is how to use non-integer numbers, as the Z80 registers are quite... integer?
HERE COMES MISS FIXED-POINT (applause)
The principle of fixed-point arithmetic is simple: we store the integer part and the decimal part in two separate registers. We will use 8-bit registers, which is enough for what we want to do. But who knows, you may want to use a 16-bit register for the integer part. It will just be more demanding, resource-wise.
Two 8-bit registers, then. The decimal part can hold a number from 0 to 255 (so 0.128 here means 128/256, that is to say one half). This can seem strange, but we have no choice. Here we will consider these numbers valid: 1.42, 0.193, 456.255, but not 5.256!
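A tiny Python sketch (not CPC code; `fixed_value` is a name of my own) shows what such numbers are worth once the part after the dot is read as a count of 256ths:

```python
def fixed_value(integer_part, fraction_byte):
    """Real value of a fixed-point number written 'integer.fraction'."""
    if not 0 <= fraction_byte <= 255:
        raise ValueError("the fractional part must fit in one byte (0..255)")
    return integer_part + fraction_byte / 256

print(fixed_value(0, 128))   # 0.5
print(fixed_value(1, 42))    # 1.1640625
```

Calling `fixed_value(5, 256)` raises an error, for the same reason that 5.256 is not a valid number here.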
Well, that's about it! You haven't learnt much yet, though. We must now implement this X factor, which represents the number of pixels separating, in memory, the point we have just displayed from the next one.
In order to simplify things a bit more, our code will work with a byte precision. It will be uglier, but faster and more understandable. If you want something better, you will have to do it yourself.
Consider two 8-bit registers: ZOOM is the integer part of our zoom rate, and ZDECI is its decimal part. To display our image at its original size, ZOOM=1 and ZDECI=0, so that each pixel is displayed. A little question: what if I want a gfx twice as big? Simple: ZOOM=0 and ZDECI=128. Don't forget that our registers can only hold values from 0 to 255.
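Going the other way, from a wanted magnification to the two bytes, is just a division. The sketch below is Python, not CPC code, and `zoom_registers` is a hypothetical helper; it assumes the rate is rounded to the nearest 256th.

```python
from fractions import Fraction

def zoom_registers(magnification):
    """Return (ZOOM, ZDECI) for a given magnification (2 = twice as big)."""
    step = 1 / Fraction(magnification)        # source bytes per displayed byte
    raw = round(step * 256)                   # fixed point, nearest 256th
    if not 0 < raw <= 0xFFFF:
        raise ValueError("rate does not fit in two 8-bit registers")
    return raw >> 8, raw & 0xFF

print(zoom_registers(1))   # (1, 0): original size
print(zoom_registers(2))   # (0, 128): twice as big
```

The smaller the rate, the bigger the image: a four times bigger gfx would use ZOOM=0 and ZDECI=64.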
There is another byte which we could have called Boulok or Pastek, but that we decided to call NEXT (it doesn't appear in the source code). It represents the number of bytes, in memory, separating the first byte of the current line of the image from the byte that is going to be displayed. NEXT is actually the result of all the previous iterations of the line. Even though NEXT has a decimal part too, we will only use the integer part to select the byte to display.
Little note: in this code, I chose to always display the first byte of the image, whatever happens. The next calculation will depend on it.
1st iteration: we've just displayed the first byte. We have to go to the next one. Let's say we want an image twice as large, so ZOOM=0 and ZDECI=128. So NEXT=0.128! We would like to know how to display the 0.128th byte, but we can't because it doesn't exist on a computer. So we will only use the integer part and display the 0th byte (the first byte of our gfx).
2nd iteration: we display the 0th byte, because NEXT=0.128 (previous calculation). We must calculate the next value of NEXT:
NEXT=NEXT+ZOOM (0.128+0.128=1.0)
3rd iteration: we display the 1st point (previous calculation).
NEXT=NEXT+ZOOM (1.0+0.128=1.128)
4th iteration: we display the 1st point.
NEXT=NEXT+ZOOM (1.128+0.128=2.0)
5th iteration: we display the 2nd point.
NEXT=NEXT+ZOOM (2.0+0.128=2.128)
6th iteration: we display the 2nd point...
And so on until all things rot. We will talk about ending conditions later. Has everybody understood ? Well, now we have to translate it into the cruel language of assembler. Piece of cake, really.
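Before moving on to assembler, the walk-through above can be replayed with a few lines of Python (a model, not the Winape source; `trace_line` is my own name). NEXT is kept as one 16-bit number whose high byte is the integer part, and the output uses the same "integer.fraction" notation as the text.

```python
def trace_line(zoom, zdeci, iterations):
    """Return (byte displayed, NEXT after the update) for each iteration."""
    step = (zoom << 8) | zdeci
    nxt = 0                       # NEXT starts at 0.0: the first byte is always shown
    rows = []
    for _ in range(iterations):
        shown = nxt >> 8          # the integer part selects the byte
        nxt = (nxt + step) & 0xFFFF
        rows.append((shown, f"{nxt >> 8}.{nxt & 0xFF}"))
    return rows

for shown, nxt in trace_line(0, 128, 6):
    print(f"display byte {shown}, then NEXT={nxt}")
```

With ZOOM=0 and ZDECI=128, the bytes displayed are 0, 0, 1, 1, 2, 2: each one comes out twice.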
It's very simple: we're going to use two 8-bit registers, H and L, that will represent NEXT. H is the integer part, L the decimal one. They are both initialized to 0 at the beginning of each line and will increase at each iteration. By how much? Well, we add ZDECI to L, and if L overflows, we have to increase H (0.128+0.128=1.0, right?). Then we add the ZOOM rate to H. NEXT has been updated. All that remains is to add its integer part to the address of the beginning of the current line of your gfx.
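The carry from the decimal byte into the integer byte is the whole trick. As a sanity check, this Python sketch (function names are mine) performs the update byte by byte, as just described, and compares it with treating H and L as one single 16-bit number.

```python
def add_bytewise(h, l, zoom, zdeci):
    """NEXT += ZOOM.ZDECI done on two separate 8-bit registers."""
    total_low = l + zdeci
    carry = total_low >> 8                 # 1 if the decimal part overflowed
    l = total_low & 0xFF
    h = (h + zoom + carry) & 0xFF          # the integer part wraps at 256
    return h, l

def add_16bit(h, l, zoom, zdeci):
    """The same update with NEXT and the zoom rate seen as 16-bit numbers."""
    hl = ((h << 8) | l) + ((zoom << 8) | zdeci)
    hl &= 0xFFFF
    return hl >> 8, hl & 0xFF

print(add_bytewise(0, 128, 0, 128))   # (1, 0): 0.128 + 0.128 = 1.0
print(add_16bit(0, 128, 0, 128))      # (1, 0) as well
```

Both routes always agree, which is why a plain 16-bit addition is enough to update NEXT.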
In order not to use memory for swapping information, we will use registers as much as possible, including the auxiliary registers.
Little reminder: an EXX switches ONLY the following registers: HL/HL', DE/DE', BC/BC'. That's it! AF and AF' don't swap, and this is done on purpose, so that you can pass a value from one set of registers to the other through A, without having to go through memory, which would be slower. To swap AF and AF', use EX AF,AF'. Both these instructions take 1 cycle (the time of a NOP), and really swap the registers, they don't "crush" them. Two EXXs one after the other have no effect, and the same goes for EX AF,AF'. IX, IY and SP are unique and can't be swapped.
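If you like to see it rather than read it, here is a toy Python model of the two register sets (only the values are modelled, and the class and method names are invented for this illustration).

```python
class Z80Registers:
    """Toy model of the main and alternate register pairs (values only)."""

    def __init__(self):
        self.main = {"AF": 0, "BC": 0, "DE": 0, "HL": 0}
        self.alt = {"AF": 0, "BC": 0, "DE": 0, "HL": 0}

    def exx(self):
        """EXX: swap BC, DE and HL with BC', DE' and HL'. AF is untouched."""
        for pair in ("BC", "DE", "HL"):
            self.main[pair], self.alt[pair] = self.alt[pair], self.main[pair]

    def ex_af_af(self):
        """EX AF,AF': swap AF with AF' only."""
        self.main["AF"], self.alt["AF"] = self.alt["AF"], self.main["AF"]

regs = Z80Registers()
regs.main.update(AF=0x1200, HL=0xC000)
regs.alt["HL"] = 0x0080
regs.exx()
print(hex(regs.main["HL"]), hex(regs.main["AF"]))   # 0x80 0x1200
```

HL now holds what was in HL', while A (the high byte of AF) still carries its value across the swap, ready to be used on the other side.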
Second reminder: the firmware uses BC' and AF', so it is important to save them if you intend to modify them. Our code will disable interrupts so that the system doesn't mess with them.
The first set of registers will point to the screen memory, as well as to the beginning of the current line of the gfx. The second set will contain ZOOM and ZDECI, and NEXT. As I said earlier, even though NEXT needs two registers (one for its integer part, one for its decimal part), only the first is added to the address of the beginning of the current line, in order to work out which byte to display. We will use A to pass this offset to the other set of registers.
So, for each iteration, here's what we have to do:
1) Display the current byte of the current line of the image.
2) NEXT=NEXT+ZOOM,DECI
3) Add our gfx pointer to NEXT
4) Start again till the end of time
Let's have a closer look at each step, now.
1) Display the current byte of the current line of the image. For the first iteration, we simply point at the beginning of the gfx. For now, let's suppose that:
HL=Pointer on the gfx.
We know that we are byte-accurate, so whenever we display a byte, we can increase DE, our screen pointer. But not HL, because we don't know how many bytes we will skip (if any!). LDI won't work here (unless you decrease HL whenever needed, which wouldn't be efficient). So:
LD A,(HL) ;Read one byte of the gfx.
2) NEXT=NEXT+ZOOM,DECI
(Please note that ZOOM,DECI means for instance 1.128). It's very, very easy: one operation does the trick. First we switch to the auxiliary registers. In HL we have NEXT (H=integer part, L=decimal part), and in DE we have ZOOM,ZDECI. All you have to do is... add them like 16-bit numbers. If the decimal part overflows, the integer part will be automatically incremented:
EXX ;Using auxiliary registers.
3) Add our gfx pointer to NEXT
We know the number of the byte that we want to display (NEXT), but this value is relative to the beginning of the current line of the gfx, whose address is in HL (of the first register set). Warning, only the integer part of NEXT is useful! So we use A to transfer it to the "gfx" registers:
LD A,H
It isn't over. Something's wrong with our registers. HL holds the address of the first byte of the line, but once we add NEXT to it, it won't anymore, which will be a problem for the next iterations! What we need is to always keep the address of the beginning of the current line somewhere. We could do something like this, using memory:
LD HL,(ADGFX)
But as we have one 16-bit register left (BC), and clever as we are, we will load BC with the address of the beginning of the line, then add BC to HL... into which we have transferred NEXT first! As every 16-bit addition puts its result in HL, BC is left intact and can be used again indefinitely. Cool! It goes like this:
LD L,A ;Transfer the integer part of NEXT into HL.
(By thinking a bit more, we could optimise all this by making NEXT relative to the previously read byte. Don't forget to reset the integer part of NEXT at each iteration!)
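Here is how steps 1 to 3 fit together, as a Python model rather than Z80 code. Everything in it is an assumption made for illustration: memory is a plain list, the addresses are arbitrary, and the dictionary keys simply reuse the register names from the text (HL' and DE' stand for the auxiliary set).

```python
def iterate_once(memory, regs):
    """One iteration: display a byte, update NEXT, point HL at the next source byte."""
    # 1) Display the current byte and move the screen pointer.
    memory[regs["DE"]] = memory[regs["HL"]]
    regs["DE"] = (regs["DE"] + 1) & 0xFFFF
    # 2) NEXT = NEXT + ZOOM.ZDECI, done in the auxiliary registers.
    regs["HL'"] = (regs["HL'"] + regs["DE'"]) & 0xFFFF
    # 3) HL = integer part of NEXT + start of the line (BC is left intact).
    a = regs["HL'"] >> 8
    regs["HL"] = (a + regs["BC"]) & 0xFFFF

memory = [0] * 64
memory[16:20] = [0x11, 0x22, 0x33, 0x44]        # one line of gfx at address 16
regs = {"HL": 16, "BC": 16, "DE": 32,           # gfx pointer, line start, screen
        "HL'": 0, "DE'": (0 << 8) | 128}        # NEXT = 0.0, zoom rate = 0.128
for _ in range(8):
    iterate_once(memory, regs)
print([hex(b) for b in memory[32:40]])
```

After eight iterations the four source bytes have each been written twice to the "screen", and BC still holds the start of the line.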
Here we are ! Now we just have to take care of the loop.
4) The loop
A little question: should the display stop when a given number of bytes have been displayed on the screen, or when a given number of bytes have been read from the original gfx? These two approaches are a bit different, and each has its particularities. I chose the first one, as it's the most obvious and the most used. This way, the displayed gfx always has the same size whatever zoom rate you're using, and the same machine time is used. On the coding side, a simple loop is enough. I use IXL, which is the least significant byte of IX (warning, Dams and Maxam don't know this register. You will have to enter the opcode by hand. Winape knows it, though).
DEC IXL
To optimise the loop, you can copy and paste the code several times, and divide the counter accordingly. Since the code is small, this doesn't cost much memory, and the machine time saved is quite satisfying. Another technique is the use of a RET table. It only takes 3 cycles (the RET instruction), and a bit of memory.
The second approach consists in always displaying the same part of the original gfx, regardless of the zoom rate. The machine time used is no longer constant: it is directly proportional to the size displayed, and therefore depends on the zoom rate. One way to test the end of the line is to refer not to a loop counter, but to NEXT. Imagine you have ZOOM=1 and that your gfx is 20 bytes wide. NEXT will start from 0 and will rise up to 19 at the end of one line. Now, for the same gfx, if your ZOOM is 0.128, the integer part of NEXT will still go from 0 to 19 by the end of the line, but for 40 bytes displayed this time. So comparing the integer part of NEXT to the size of your gfx works fine.
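A quick Python check of this second approach (again a model of the arithmetic, with a helper name of my own): the line ends when the integer part of NEXT reaches the width of the gfx, and we count how many bytes were displayed on the way.

```python
def bytes_displayed(zoom, zdeci, width):
    """Count displayed bytes when the loop runs while int(NEXT) < width."""
    step = (zoom << 8) | zdeci
    if step == 0:
        raise ValueError("a zoom rate of 0.0 would never reach the end of the line")
    nxt = 0
    count = 0
    while (nxt >> 8) < width:      # compare the integer part with the gfx width
        count += 1                 # one byte goes to the screen
        nxt += step                # NEXT = NEXT + ZOOM.ZDECI
    return count

print(bytes_displayed(1, 0, 20))    # 20
print(bytes_displayed(0, 128, 20))  # 40
```

The same 20 source bytes cost 20 iterations at ZOOM=1 and 40 at 0.128, which is exactly why the machine time varies with this approach.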
However, if you reduce the zoom, you will notice garbage on the right side of your gfx! This is normal, as you've just displayed fewer bytes than before. To avoid this, you can either clear this area yourself, or add a little blank column inside your gfx, with a width proportional to the maximum speed of your un-zoom.
So all you have to do is replace the loop just above with this:
CP 19
And that's it! Perhaps the "JR C" puzzles you, as you would have written "JR NZ". JR C means "jump if lower than". How come?
Reminder: when you write "CP B", the Z80 computes A-B. If A is smaller than B, the subtraction overflows and the Carry is set.
But in our case, why use C and not NZ? Simply because nothing guarantees that 19 will actually be reached by NEXT! Perhaps the zoom rate will be too high, and NEXT will go from 18 to 20 (or the opposite) without ever reaching 19! Try replacing C with NZ, you will see funny things going on.
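The difference between the two tests is easy to reproduce in Python (a sketch with names of my own; the iteration cap only exists so that the demonstration terminates). With a zoom rate of 2.0, the integer part of NEXT goes 0, 2, 4... and steps right over 19.

```python
def loop_length(zoom, zdeci, limit, test, cap=1000):
    """Run the line loop until `test(integer part, limit)` fails or `cap` is hit."""
    step = (zoom << 8) | zdeci
    nxt = 0
    iterations = 0
    while iterations < cap:
        iterations += 1
        nxt = (nxt + step) & 0xFFFF          # 16-bit value, wraps around
        if not test(nxt >> 8, limit):        # the compare, then the conditional jump
            break
    return iterations

def lower_than(a, n):
    return a < n       # like JR C: loop while A is lower than the limit

def not_equal(a, n):
    return a != n      # like JR NZ: loop while A differs from the limit

print(loop_length(2, 0, 19, lower_than))  # 10
print(loop_length(2, 0, 19, not_equal))   # 1000: it never stops by itself
```

The "lower than" test leaves the loop as soon as NEXT passes the limit, whereas the "not equal" test keeps waiting for a value that never comes.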
IS THAT IT?
Almost, but not yet. We have to deal with the image. As you know, we work with byte accuracy, which means that if our image is in MODE 1, we have to stretch it until it is four times bigger, so that we can see all its pixels. I chose to simply convert a little text written with the #BB5A vector into a four times bigger gfx (in the "TRANSF" routine). The code is very simple: it takes each pixel of the original gfx and converts it into a byte. As it is in Mode 1, I isolate each pixel (four per byte) with an AND, then shift it so that it ends up in the right position. Four comparisons (maximum) then convert this into a Mode 1 byte.
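To give an idea of what such a conversion does, here is a Python sketch. It is not the TRANSF routine from the source, just one possible way to get the same kind of result, and it assumes the usual CPC Mode 1 layout (bit 0 of each pixel's pen in bits 7 to 4, bit 1 in bits 3 to 0, leftmost pixel first). The names are mine.

```python
# Byte that fills all four Mode 1 pixels with a given pen (assumed CPC layout).
SOLID_BYTE = {0: 0x00, 1: 0xF0, 2: 0x0F, 3: 0xFF}

def expand_mode1_byte(byte):
    """Turn one Mode 1 byte (4 pixels) into 4 bytes, one solid byte per pixel."""
    expanded = []
    for pixel in range(4):                       # pixel 0 is the leftmost
        low = (byte >> (7 - pixel)) & 1          # pen bit 0 lives in bits 7..4
        high = (byte >> (3 - pixel)) & 1         # pen bit 1 lives in bits 3..0
        expanded.append(SOLID_BYTE[(high << 1) | low])
    return expanded

print([hex(b) for b in expand_mode1_byte(0x8C)])
```

Each source pixel becomes one full byte of its own colour, so a byte-accurate zoom can no longer skip or repeat pixels that share a byte.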
As the loop technique I use during the zoom is done by counting the displayed bytes, I convert a bigger gfx than what I actually created, so that my code displays "blank" at the right of my zoom, even when the un-zoom rate is high. But if you un-zoom too much, you will see the gfx cycle!
I hope you enjoyed this little article. The source provided is quite raw, but works fine. There are a lot, lot, LOT of things to improve, like the zoom in Y, the pixel management, as well as all the optimisations you will imagine!
I hope you've learnt a thing or two. If you have any questions, don't hesitate to contact me!