Useful code snippets
France krusty - 26 July 2012 - 22:39:38 326 posts
Waiting loop

Use this thread to share interesting code.

I'll start with some code from Rhino that has eased my coding a lot.
This is a version for vasm; the original can be found at the link below.



;;
; Macro to wait several cycles
; Maximum is 1024
;
; Stolen from Rhino/Batman Group
; http://cpcrulez.fr/forum/viewtopic.php?p=15827#p15827



MACRO WAIT_CYCLES, cycles

assert \cycles <= 1024, 'Too many nops'

\@loops equ (\cycles-1)/4
\@loopsx4 equ \@loops*4
\@nops equ \cycles-\@loopsx4-1

ld b, \@loops
.\@change_waitLoop
djnz .\@change_waitLoop

defs \@nops,0

endmacro

Germany TFM - 14 August 2012 - 23:51:11 146 posts
not "stolen to...", please use "stolen from..." ;-)
France krusty - 15 August 2012 - 12:06:03 326 posts
@tfm: corrected
France krusty - 18 August 2012 - 15:58:16 326 posts
Small modification of the waiting macro for when the number of nops is less than 64


MACRO WAIT_CYCLES, cycles

assert \cycles <= 1024, 'Too many nops'
assert \cycles > 0, 'Wait time must be positive'

; Compute the number of loops and extra nops
.\@loops equ (\cycles-1)/4
.\@loopsx4 equ .\@loops*4
.\@nops equ \cycles-.\@loopsx4-1

if .\@loops != 0
; The loop itself takes 4*loops+1 cycles, the nops fill the remainder
ld b, .\@loops
.\@change_waitLoop
djnz .\@change_waitLoop
defs .\@nops,0
else
; Too short for a loop: produce the whole wait with nops
defs \cycles,0
endif

endmacro
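
As a sanity check on the arithmetic, here is a small Python model of what the macro computes. It is only a sketch and not part of the original code: the function names are made up, and it assumes the usual CPC timings counted in NOP units (LD B,n = 2, DJNZ = 4 when it loops and 3 when it falls through, NOP = 1). It only covers waits of 5 cycles or more, where a loop is actually emitted.

```python
def split_wait(cycles):
    """Mirror the macro's equ lines: number of DJNZ loops and trailing NOPs."""
    assert 0 < cycles <= 1024, "same limits as the macro's asserts"
    loops = (cycles - 1) // 4
    nops = cycles - loops * 4 - 1
    return loops, nops


def emitted_time(loops, nops):
    """Time of 'ld b,loops / djnz / defs nops,0' in NOP units (assumed timings)."""
    ld_b = 2                        # LD B,n
    djnz = 4 * (loops - 1) + 3      # taken (loops-1) times, then falls through
    return ld_b + djnz + nops


if __name__ == "__main__":
    for cycles in (5, 64, 1024):
        loops, nops = split_wait(cycles)
        print(cycles, loops, nops, emitted_time(loops, nops))
```

Under those assumed timings the loop costs 4*loops+1, which is where the "-1" in the nops formula comes from.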
Tortuga Grim - 26 November 2012 - 21:54:08 521 posts
;;Signed 16x16 multiplication, 32 bits result
;;Grim/SML
;;
;;Cycles: around 333/427µs (not measured very thoroughly)
;;Size : 28 bytes
;;
;;Algo:
;; Booth's algorithm
;; http://en.wikipedia.org/wiki/Booth's_multiplication_algorithm
;;
;;Caveat:
;; Epic-fail if called with BC=&8000
;;
;;Input:
;; BC = Multiplicand
;; DE = Multiplier
;;Output:
;; HLDE = DE*BC
;; AF,HL,DE modified

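mul16s:
ld hl,0
ld a,16 ; 16 multiplier bits to process
or a ; clear carry (previous multiplier bit = 0)
_mul16s
jr nc,$+3 ; previous bit was 0: skip the add
add hl,bc

bit 0,e ; current multiplier bit
jr z,$+5 ; current bit is 0: skip the subtract
or a
sbc hl,bc

sra h ; arithmetic shift right of HLDE
rr l
rr d
rr e ; carry = multiplier bit just consumed
dec a ; DEC leaves the carry untouched
jr nz,_mul16s
ret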
For extra speed, unroll and remove the first jr nc,$+3:add hl,bc
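
For anyone who wants to check the loop without an emulator, here is a rough Python model of the register operations above. It is a sketch, not a cycle-accurate emulation: the variable names simply mirror the registers, and the function name is reused from the routine for readability.

```python
def mul16s(bc, de):
    """Model of the loop above: returns HLDE as a signed 32-bit integer."""
    bc &= 0xFFFF
    de &= 0xFFFF
    hl = 0
    carry = 0                                # 'or a' clears the carry
    for _ in range(16):
        if carry:                            # jr nc,$+3 / add hl,bc
            hl = (hl + bc) & 0xFFFF
        if de & 1:                           # bit 0,e / sbc hl,bc
            hl = (hl - bc) & 0xFFFF
        carry = de & 1                       # bit shifted out by 'rr e'
        de = (de >> 1) | ((hl & 1) << 15)    # rr d / rr e
        hl = (hl >> 1) | (hl & 0x8000)       # sra h / rr l
    result = (hl << 16) | de
    return result - (1 << 32) if result & 0x80000000 else result


if __name__ == "__main__":
    print(mul16s(123, -45))      # ordinary case
    print(mul16s(-32768, 1))     # the BC=&8000 caveat: the sign comes out wrong
```

The second call shows the caveat from the header: -BC does not fit in 16 bits when BC=&8000, so the result has the wrong sign.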
Tortuga Grim - 27 November 2012 - 16:34:26 521 posts
;;Signed 8x8 multiplication, 16 bits result
;;Grim/SML
;;
;;Cycles: 139µs (constant)
;;Size : 19 bytes
;;
;;Algo:
;; Booth's algorithm
;; http://en.wikipedia.org/wiki/Booth's_multiplication_algorithm
;;
;;Caveat:
;; Epic-fail if called with H=&80 (ie. -128)
;;
;;Input:
;; H = Multiplier
;; L = Multiplicand
;;Output:
;; HL = H*L
;; A = high(H*L)
;; B, Flags modified

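mul8s:
ld b,8 ; 8 bits to process
xor a ; A = 0, carry cleared (previous bit = 0)
_mul8s
jr nc,$+3 ; previous bit was 0: skip the add
add a,h

bit 0,l ; current bit of L
jr z,$+3 ; current bit is 0: skip the subtract
sub a,h

sra a ; arithmetic shift right of AL
rr l ; carry = bit of L just consumed
djnz _mul8s ; DJNZ leaves the flags untouched
ld h,a
ret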
For extra speed, unroll and remove the first jr nc,$+3:add a,h
Tortuga Grim - 28 November 2012 - 00:49:33 521 posts
Macro for WinAPE/Maxam assembler.
;; Wait N number of CPU-Cycle
;; Input parameter:
;; 1 = Number (n) of cycles to wait
;; For n=[0,127], output N NOPs (no CPU-register modified)
;; For n=[128,1024], output a short DJNZ loop (register B and Flags are modified)
;; For n=[1025,32766], output a big DJNZ loop (registers BC and Flags are modified)
;; For n>32766, output is 42
macro GNOP n
LET gnop_maxnop = 128 ;<= can be adjusted to any value greater than 3
LET gnop_n = n
ifnot n - 1025 AND &8000
; GigaLoop with BC
LET gnop_n = n - 6
LET gnop_t = 256*4-1 + 4
LET gnop_c = gnop_n - 3 / gnop_t
LET gnop_b = -gnop_t*gnop_c + gnop_n / 4
LET gnop_t = gnop_t*gnop_c
LET gnop_t = 4*gnop_b - 1 + gnop_t
LET gnop_n = gnop_n - gnop_t

ld bc,gnop_b*256 + gnop_c + 1
@gnop djnz $
dec c
jr nz,@gnop

else
ifnot n - gnop_maxnop AND &8000
; MiniLoop with B
LET gnop_b = n-1 / 4
LET gnop_n = -gnop_b*4 - 1 + n

ld b,gnop_b
djnz $
endif
endif
ds gnop_n,0
mend
Usage:
	GNOP 6128 ; Wait for 6128µs with a BC Loop
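
To see where the numbers in the big BC loop come from, here is a Python sketch of that branch only. It is not part of the macro and rests on two assumptions: that the assembler evaluates expressions strictly left to right (which the macro's own arithmetic implies), and the usual CPC timings in NOP units (LD BC,nn = 3, DJNZ = 4/3, DEC C = 1, JR NZ = 3 taken / 2 not taken). The function name is made up.

```python
def gnop_big(n):
    """Return (b, c, trailing_nops, total_time) for the BC loop branch."""
    gn = n - 6
    t = 256 * 4 - 1 + 4              # 1027: one full pass of the outer loop
    c = (gn - 3) // t
    b = (gn - t * c) // 4
    nops = gn - (4 * b - 1 + t * c)

    total = 3                        # ld bc,gnop_b*256 + gnop_c + 1
    reg_b = b & 0xFF
    reg_c = (c + 1) & 0xFF
    while True:
        while True:                  # djnz $
            reg_b = (reg_b - 1) & 0xFF
            if reg_b:
                total += 4
            else:
                total += 3
                break
        reg_c = (reg_c - 1) & 0xFF   # dec c
        total += 1
        if reg_c:                    # jr nz,@gnop taken
            total += 3
        else:                        # falls through
            total += 2
            break
    return b, c, nops, total + nops


if __name__ == "__main__":
    print(gnop_big(6128))
```

For the usage example this gives B=246, C=5+1 and 4 trailing NOPs, adding up to 6128. I have only checked a couple of values this way, so treat it as an illustration rather than a proof for every n.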
Tortuga Grim - 28 November 2012 - 18:49:11 521 posts
;;Test all the 64k-pages (0 to 7) of extended RAM with standard MMR.
;;(does not support exotic memory expansions using additional MMR registers,
;;only TFM cares about that :)
;;
;;Tip from Capt'n Obvious: do not assemble within &4000-&7FFF, Aye'
;;
;;Output
;; C = bit-map of available 64k page(s).
;; A bit is set when all four 16k banks of the 64k page were found.
;;
;; C.bit0 = page 0 (eg. 6128)
;; C.bit1 = page 1
;; ...
;;
;; One byte in all the available extended memory banks has been modified.
;; (that should not be a problem, sane people do not use extended RAM
;; before testing if it's available or not)
;;
;; HL,D,BC,AF modified
;; MMR is reset to the base 64k (no bank left paged in)
;;
;;MightDoLater:
;; Check if special paging configurations (eg. &C1-&C3) are available.

checkRAM:
ld hl,&51C0
ld bc,&7F04
out (c),l
ld d,(hl)
; First pass, tag all possible banks
ld (hl),c
ld a,&FF
_chkRAM_tagBank out (c),a
ld (hl),a
dec a
bit 2,a
jr nz,_chkRAM_tagBank
sub c
bit 6,a
jr nz,_chkRAM_tagBank
; Second pass, check what's really there
ld a,&FF
_chkRAM_chkBank out (c),a
cp (hl)
jr nz,_chkRAM_skpPage
dec a
bit 2,a
jr nz,_chkRAM_chkBank
; All four banks passed OK
; Flag the 64k page as available
scf
_chkRAM_skpPage rl c
and %11111000
dec a
bit 6,a
jr nz,_chkRAM_chkBank
; Restore modified byte in base RAM
out (c),l
ld (hl),d

ret
Usage:
	; Check the expansion RAM
call checkRAM

; Say this program wants to use the 64k page 0
; (eg. the one available on a 6128)
; So we only care about C.bit0
ld a,%00000001
and c
jr nz,_RAM_OK
; Uh!Oh! Page 0 is not available!
; Display ponies, Nyan cat or both until
; the user upgrades or dies.
ponies jr ponies

_RAM_OK
; 64k Page 0 available, great!
; Proceed...
Germany TFM - 28 November 2012 - 23:05:17 146 posts
;-)
Tortuga Grim - 29 November 2012 - 18:23:59 521 posts
Facts:
* The original CPC has no reset-button.
* Most democoders bragging about doing stuff only for the original hardware rely on expansions or custom-mods with a reset-button to exit their productions.
* Real users of original hardware have to switch OFF/ON their CPC after watching a demo.
* Frequently switching a CPC ON/OFF is harsh on its already aging electronic parts.

Truth revealed:
* CPC Democoders aim to destroy the original hardware!

;; ESC-Key test
;;
;; Size : 36 bytes
;; Time : 53 µs
;;
;; Output
;; Z is set if ESC is pressed
;; BC, AF are modified

macro out0
dw &71ED
mend

EscApe
ld bc,&F40E
out (c),c
ld bc,&F6C0
out (c),c
out0
ld bc,&F792
out (c),c
ld bc,&F648
out (c),c
ld a,&F4
in a,(0)
out0
ld bc,&F782
out (c),c
bit 2,a
ret
Usage:
	call EscApe
jp z,_banana

; ...


;; Banana reset routine
_banana
; If you messed up with the CRTC, put here your seamless transition
; to Firmware default configuration.

; If you messed up with ASIC features, clean up your mess here.

di
ld hl,_rst0
ld bc,5
ld d,b
ld e,b
ldir
rst 0
_rst0
ld bc,&7F89
out (c),c
BANANA!
Germany TFM - 29 November 2012 - 23:40:55 146 posts
What about:

_banana

ld bc,&7f80 ;Select lower ROM, stay in Mode 0
out (c),c ;bank lower ROM in
rst 0 ;jump to address 0000, firmware boots...


That should always work.
Tortuga Grim - 30 November 2012 - 00:08:22 521 posts
If you do that with your routine located within &0001-&3FFF, the RST 0 won't be executed because the CPU will fetch its next opcode from the Lower ROM that just got paged in. I wouldn't call that a soft-reset, more like Russian roulette :)
Germany TFM - 30 November 2012 - 03:53:19 146 posts
Haha. Right, it must be between &4000-&7FFF. But ... &0000-&3FFF and &C000-&FFFF ... are .... IMHO... screen RAM ;-)

Tortuga Grim - 08 February 2013 - 17:42:21 521 posts
;; Automatically select Absolute or Relative jump instruction
;; depending on the jump distance.
;;
;; Tip from Capt'n Obvious:
;; Use self-modifying code on these jumps at your own risk, Aye'
;;
;; WinAPE macros
macro _getJmpMode address
let _varJmpMode = $ - address + 129 AND &FF00
mend
macro jmp address
_getJmpMode address
if _varJmpMode
jp address
else
jr address
endif
mend
macro jz address
_getJmpMode address
if _varJmpMode
jp z,address
else
jr z,address
endif
mend
macro jnz address
_getJmpMode address
if _varJmpMode
jp nz,address
else
jr nz,address
endif
mend
macro jc address
_getJmpMode address
if _varJmpMode
jp c,address
else
jr c,address
endif
mend
macro jnc address
_getJmpMode address
if _varJmpMode
jp nc,address
else
jr nc,address
endif
mend
Usage:
calvin	equ &3fa0
hobbes equ &8000

org &4000
loop
or a
jz calvin ; within relative range, JR selected

add a,51
jc hobbes ; outside relative range, JP selected

jmp loop ; within relative range, JR selected
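
The range test in _getJmpMode can be checked against the definition of JR with a few lines of Python. This is only a sketch with made-up function names; it assumes `$` is the address of the jump instruction itself and that the expression is evaluated left to right, i.e. as (($ - address + 129) AND &FF00).

```python
def needs_jp(here, target):
    """True when the macro's test selects JP; 'here' is the address of the jump."""
    return ((here - target + 129) & 0xFF00) != 0


def jr_reaches(here, target):
    """Reference: JR's signed 8-bit displacement is relative to here + 2."""
    return -128 <= target - (here + 2) <= 127


if __name__ == "__main__":
    # Addresses from the usage example above
    print(needs_jp(0x4001, 0x3FA0))   # jz calvin
    print(needs_jp(0x4005, 0x8000))   # jc hobbes
    print(needs_jp(0x4008, 0x4000))   # jmp loop
```

The +129 shifts the reachable window ($-126 to $+129) onto 0..255, so any bit set in the high byte means the target is out of JR range.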

Nobody else has nice code snippets to share? :(
France Toms - 14 February 2013 - 13:29:34 272 posts
Grimmy: I just read up on Booth's multiplication algorithm and spotted a bug in your signed 8x8 multiplication routine. Here are the corrected lines:

bit 0,L ; instead of bit 0,A
jr Z,$+3 ; instead of jr NZ,$+3
sub a,h

Assuming that the two least significant bits of P are L register's least significant bit and Carry flag.

It works better like this. Am I right? :)
France PulkoMandy - 14 February 2013 - 18:26:43 633 posts

Quote (Grim): Nobody else has nice code snippets to share? :(

I'm working on C code running on a 6809 CPU. Do you think you'd have a use for it?
http://pulkomandy.tk/projects/thomson/browser

Quote (Toms): It works better like this. Am I right? :)

All code snippets in this thread are DRM protected to make sure you understand them before using them ;)
France krusty - 14 February 2013 - 18:32:54 326 posts

Quote (PulkoMandy): I'm working on C code running on a 6809 CPU. Do you think you'd have a use for it?

And what about your part for the MD? :-)
Tortuga Grim - 14 February 2013 - 18:44:52 521 posts
@Toms: Duh! Yeah! It seems my copy/paste skill sucks, I copied a bad version. (I fixed it) But you win an extra code snippet! (DRM-free... hopefully :)

;; 8-bit PRNG (Galois LFSR)
;; Period : 255
;; Size : 11 bytes
;; Time : 13 / 14 µs (min/max)
;;
;; Output:
;; A = Random number
;; Flags are modified

rnd8:
_rnd8_var_seed EQU $+1
ld a,51 ; Seed
add a,a
jr nc,_rnd8
xor %00011101 ; Tap
_rnd8 ld (_rnd8_var_seed),a
ret

;; Initialize seed value of rnd8.
;; WARNING: Takes a lot of time!
seed8:
ld a,r
ld hl,0
_seed8
add a,(hl)
inc l
jr nz,_seed8
inc h
jr nz,_seed8
; check seed is > 0
or a
; if we generated a null seed,
; we keep the default seed value
ret z
; otherwise, we modify it
ld (_rnd8_var_seed),a
ret
Usage:
		; This is somewhere at the beginning of your program.
; The more "random" the content in memory is
; (eg. Firmware/BASIC variables), the better.
call seed8
; ...

; This is somewhere in your program.
; Get a random number from 0 to 255 in A
call rnd8
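
The claimed period is easy to confirm with a Python model of the update step. This is a sketch with made-up function names; bit 8 of the intermediate value stands in for the carry flag.

```python
def rnd8_step(seed):
    """One call of rnd8: returns the new seed, which is also the value left in A."""
    a = (seed << 1) & 0x1FF          # add a,a (bit 8 = carry)
    if a & 0x100:                    # carry set: apply the tap
        a = (a & 0xFF) ^ 0x1D        # xor %00011101
    return a


def period(seed):
    """Number of steps before the sequence returns to the starting seed."""
    state, count = rnd8_step(seed), 1
    while state != seed:
        state = rnd8_step(state)
        count += 1
    return count


if __name__ == "__main__":
    print(period(51))
```

A seed of 0 maps to 0 forever, which is why seed8 refuses to store a null seed.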

@PulkoMandy: C on 6809, I don't care much. (I hate 6809 :) And posting code snippets doesn't require much effort actually, you just have to pick from your existing codebase (assuming, of course, it exists :)
France PulkoMandy - 14 February 2013 - 20:17:44 633 posts
Krusty: well, as far as I know the Forever party comes first ! The Megademo comes next...

@Grimmy: well, I'll look at my small codebase then :)
Tortuga Grim - 15 February 2013 - 06:15:07 521 posts
;; 8-bit signed magnitude approximation (2.8% accuracy)
;; m = sqr(x^2 + y^2)
;;
;; Size : 47 bytes
;; Time : 35 / 40 µs (min/max)
;;
;; Algorithm:
;; "Fast Amplitude Approximations of Quadrature Pairs" by B.K Levitt, G.A. Morris
;; http://tmo.jpl.nasa.gov/progress_report2/42-40/40L.PDF
;;
;; Input: (8-bit signed)
;; H = x
;; L = y
;;
;; Output:
;; A ~= sqr(x^2 + y^2)
;; HL and flags are modified
hypot:
; abs(x)
xor a
sub h
jp m,$+4
ld h,a
; abs(y)
xor a
sub l
jp m,$+4
ld l,a
; min/max(x,y)
ld a,l
cp h
jr c,$+5
ld l,h ; do you
ld h,a ; hear me
ld a,l ; dave?
; H = max(x,y) = max
; L = min(x,y) = min
; A = min

; Select approximation depending on (max >= 3*min) condition
; 3*min
add a,a ; min*2
add a,l ; min*2 + min = 3*min
jr c,_hypot_min
cp h
jr c,_hypot_max
_hypot_min
; Approximation with m = (min/2) + max*(7/8)
; min/2
ld a,l
srl a
; max*7/8 = (max + max*2 + max*4) / 8
; = max/8 + max/4 + max/2
srl h
add a,h ; + max/2
srl h
add a,h ; + max/4
srl h
add a,h ; + max/8
ret

_hypot_max ; Approximation with m = (min/8) + max
ld a,l
srl a ; min/2
srl a ; min/4
srl a ; min/8
add a,h ; min/8 + max
ret
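
Here is a Python sketch of the same computation, handy for comparing against the real square root. It is a model of the routine above rather than a general implementation: the function name is made up and inputs are assumed to be in -127..127.

```python
import math


def hypot8(x, y):
    """Model of the routine: integer approximation of sqrt(x^2 + y^2)."""
    mx = max(abs(x), abs(y))
    mn = min(abs(x), abs(y))
    three_min = 3 * mn
    if three_min <= 0xFF and three_min < mx:
        # _hypot_max: min/8 + max
        return (mn >> 3) + mx
    # _hypot_min: min/2 + max/2 + max/4 + max/8, each term truncated
    return (mn >> 1) + (mx >> 1) + (mx >> 2) + (mx >> 3)


if __name__ == "__main__":
    for x, y in ((30, 40), (-100, 20), (127, 127), (3, 4)):
        print(x, y, hypot8(x, y), round(math.hypot(x, y), 2))
```

Since every shift truncates, small inputs drift further from the ideal figure: (3, 4) comes out as 4 instead of 5, while (30, 40) gives exactly 50.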
@Toms: I thought you might like this one better :)
Germany Apollo - 20 February 2013 - 20:18:11 36 posts
@Grim: You are an incredible source of knowledge and new ideas! I want to thank you very much for sharing these interesting code snippets!!
I was working on an approximation of sqrt(x^2+y^2) myself without knowing about the paper you linked.
I hope I can contribute one day as well!

By the way, any chance you'll find the time and mood to finish your article about the CRTC and GA some day? *blinking with the eyes*
France CloudStrife - 20 February 2013 - 20:42:32 161 posts
Grim: for something a bit less precise but simpler, there is also alpha*max + beta*min, which gives an accuracy better than 7% with alpha=1 and beta=3/8...

Edit: well, I need to open my eyes, they talk about it in the paper...
Tortuga Grim - 27 February 2013 - 05:45:44 521 posts
@Apollo: CRTC and GA are so booooring! Would you not prefer an atan2 approximation routine instead by any chance? :)

@Cloudstrife: yep, this is just an improved amax+bmin with two coefficient sets instead of one. It takes a bit longer to execute, but the gain in accuracy is worth a handful of microseconds more imo.
France Toms - 01 March 2013 - 12:19:34 272 posts
Grimmy: an atan2 approximation would be a very appreciated late birthday gift :)
France CloudStrife - 01 March 2013 - 15:09:31 161 posts
Happy birthday Toms !

;; A very high speed atan2 approximation
;; Input:
;; A, F, D, H, IX... Don't care..
;; Output:
;; A ~= atan2(x, y)
;; Flags are not modified
atan2: ld a,0
ret
Croatia bonefish - 01 March 2013 - 23:41:29 40 posts
:)))))))))))
Tortuga Grim - 02 March 2013 - 17:21:38 521 posts
@Cloudstrife: Flags are not modified :)
France CloudStrife - 02 March 2013 - 19:50:53 161 posts
Oops :)
And now with a better approximation on the [-128;127] range for x and y !
France PulkoMandy - 02 March 2013 - 22:44:34 633 posts
I can do it faster !


atan2: xor a
ret
Tortuga Grim - 02 March 2013 - 23:06:10 521 posts
@PulkoMandy: But flags are modified :)