mirror of https://github.com/blawar/GLideN64.git synced 2024-07-04 10:03:36 +00:00
Commit Graph

7 Commits

Author SHA1 Message Date
gizmo98
b8a18f57e4 arm neon: add faster versions of InverseTransformVectorNormalize 2x and 4x
2x, opt level -O2
old: 0.42 of the C function's runtime
new: 0.36 of the C function's runtime

4x, opt level -O2
old: 0.30 of the C function's runtime
new: 0.23 of the C function's runtime
2017-04-05 10:40:01 +07:00
gizmo98
2acc3c7775 arm neon: add multivector versions of InverseTransformVectorNormalize
ARM NEON performance is much better when more data can be loaded and
processed at once.

Four vectors. Opt level -O3
---------------------------
runtime 100% - C function
runtime 99% - Neon function
runtime 56% - Neon 2x function
runtime 36% - Neon 4x function

Four vectors. Opt level -O2
---------------------------
runtime 100% - C function
runtime 71% - Neon function
runtime 43% - Neon 2x function
runtime 30% - Neon 4x function
2017-03-27 14:22:49 +07:00
gizmo98
f907706dae arm neon: remove DotProduct
Compared to the C function, the NEON DotProduct runs slower.
-O0 factor 0.86
-O1 factor 1.60
-O2 factor 1.59
-O3 factor 1.57
Six values and three multiply/adds are not enough work to fill at least
two quad registers and hide the NEON latency.
2017-03-24 18:36:38 +01:00
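To make the size of that workload concrete, here is a minimal C sketch of a three-component dot product. The name and signature are assumptions for illustration and are not taken from the repository.

```c
/* Illustrative scalar dot product of two 3-component vectors.
   Name and signature are hypothetical. */
float dot_product3(const float v0[3], const float v1[3])
{
    /* Six input values, three multiplies, two adds: very little
       arithmetic per call. */
    return v0[0] * v1[0] + v0[1] * v1[1] + v0[2] * v1[2];
}
```

For example, calling it with {1, 2, 3} and {4, 5, 6} returns 32.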
gizmo98
637633ae5d arm asm: add MultMatrix2 without memcpy
Unlike the C MultMatrix function, the NEON asm writes m0 only after the
calculation has finished.
2017-03-18 19:49:33 +07:00
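The difference between an in-place multiply that goes through a temporary plus memcpy and one that does not can be sketched in plain C. This is an illustration only: the function names, the flat row-major layout, and the convention a = a * b are assumptions, and may differ from the repository's MultMatrix. The second variant reads each input row into locals before overwriting it, which loosely mirrors holding values in registers and storing only once they have been computed.

```c
#include <string.h>

/* Variant 1 (hypothetical): compute into a temporary, then copy back. */
void mat4_mul_inplace_copy(float a[16], const float b[16])
{
    float tmp[16];
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            tmp[i * 4 + j] = a[i * 4 + 0] * b[0 * 4 + j]
                           + a[i * 4 + 1] * b[1 * 4 + j]
                           + a[i * 4 + 2] * b[2 * 4 + j]
                           + a[i * 4 + 3] * b[3 * 4 + j];
    memcpy(a, tmp, sizeof tmp);
}

/* Variant 2 (hypothetical): no temporary matrix and no memcpy.
   Requires that a and b do not overlap. */
void mat4_mul_inplace_rows(float a[16], const float b[16])
{
    for (int i = 0; i < 4; ++i) {
        /* Load the whole row before any element of it is overwritten. */
        const float a0 = a[i * 4 + 0], a1 = a[i * 4 + 1];
        const float a2 = a[i * 4 + 2], a3 = a[i * 4 + 3];
        for (int j = 0; j < 4; ++j)
            a[i * 4 + j] = a0 * b[0 * 4 + j] + a1 * b[1 * 4 + j]
                         + a2 * b[2 * 4 + j] + a3 * b[3 * 4 + j];
    }
}
```

Under this convention each output row depends only on the matching input row, so buffering one row is enough. With the opposite multiplication order the whole result would have to be held before the first store.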
Sergey Lipskiy
6af6e2c17f Rewrite lighting.
Fixed wrong textures in Chopper Attack #99
Thanks to Gillou68310 for tracking down the origin of the problem.
2016-11-26 19:31:50 +07:00
Francisco Zurita
7386a036ce Port glN64 for Android NEON optimizations
2016-07-06 11:00:40 +06:00
Sergey Lipskiy
52d68d1389 Move all sources to src folder.
2015-05-13 10:21:32 +06:00