mirror of https://github.com/blawar/GLideN64.git synced 2024-07-07 03:13:49 +00:00

6 Commits

Author SHA1 Message Date
gizmo98
2acc3c7775 arm neon: add multivector versions of InverseTransformVectorNormalize
ARM NEON performs much better when more data can be loaded and
processed per call.

Four vectors. Opt level -O3
--------------------
runtime 100% - C function
runtime 99% - Neon function
runtime 56% - Neon 2x function
runtime 36% - Neon 4x function

Four vectors. Opt level -O2
--------------------
runtime 100% - C function
runtime 71% - Neon function
runtime 43% - Neon 2x function
runtime 30% - Neon 4x function
2017-03-27 14:22:49 +07:00
gizmo98
f907706dae arm neon: remove DotProduct
The NEON DotProduct runs slower than the C function.
-O0 factor 0.86
-O1 factor 1.60
-O2 factor 1.59
-O3 factor 1.57
Six values and 3x mult/add are not enough workload to fill at least two
quads and hide NEON latency.
2017-03-24 18:36:38 +01:00
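
For scale, the workload the commit above describes (six input values, three multiplies combined by adds) looks roughly like this in plain C. The name and signature are assumptions for illustration, not taken from the repository.

```c
/* Hypothetical scalar dot product of two 3-component vectors:
 * six loads and three multiplies whose products are summed. */
float dot_product(const float v0[3], const float v1[3])
{
    return v0[0] * v1[0] + v0[1] * v1[1] + v0[2] * v1[2];
}
```

With so little arithmetic per call, a compiler can already schedule the scalar version well, which fits the commit's observation that the NEON variant did not pay off.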
gizmo98
637633ae5d arm asm: add MultMatrix2 without memcpy
Unlike the C function MultMatrix, the NEON asm writes m0 only after
the calculation.
2017-03-18 19:49:33 +07:00
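
The difference the commit above describes can be sketched in portable C. Everything here is an assumption made for illustration: the function names, the multiplication convention (result = m1 * m0), and the use of a local array as a stand-in for the NEON registers that hold m0 during the calculation.

```c
#include <string.h>

/* Assumed convention: dest[i][j] = sum over k of m1[i][k] * m0[k][j]. */
void mult_matrix(float m0[4][4], float m1[4][4], float dest[4][4])
{
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            dest[i][j] = m1[i][0] * m0[0][j] + m1[i][1] * m0[1][j]
                       + m1[i][2] * m0[2][j] + m1[i][3] * m0[3][j];
}

/* Baseline pattern: compute into a temporary, then memcpy back into m0. */
void mult_matrix2_memcpy(float m0[4][4], float m1[4][4])
{
    float tmp[4][4];
    mult_matrix(m0, m1, tmp);
    memcpy(m0, tmp, sizeof(tmp));
}

/* Sketch of the "no memcpy" idea: read m0 once into locals (standing in
 * for registers), then store each finished row straight into m0. */
void mult_matrix2_late_store(float m0[4][4], float m1[4][4])
{
    float a[4][4];
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            a[i][j] = m0[i][j];

    for (int i = 0; i < 4; ++i) {
        float r[4];
        /* Finish the whole row before storing, so the result is still
         * correct if m0 and m1 point at the same matrix. */
        for (int j = 0; j < 4; ++j)
            r[j] = m1[i][0] * a[0][j] + m1[i][1] * a[1][j]
                 + m1[i][2] * a[2][j] + m1[i][3] * a[3][j];
        for (int j = 0; j < 4; ++j)
            m0[i][j] = r[j];
    }
}
```

In plain C both variants cost about the same. The point is only the ordering: all reads of m0 happen before any write, so no separate result buffer has to be copied back afterwards.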
Sergey Lipskiy
6af6e2c17f Rewrite lighting.
Fixed wrong textures in Chopper Attack #99
Thanks Gillou68310 for detection of the problem's origin.
2016-11-26 19:31:50 +07:00
Francisco Zurita
7386a036ce Port glN64 for Android NEON optimizations 2016-07-06 11:00:40 +06:00
Sergey Lipskiy
52d68d1389 Move all sources to src folder. 2015-05-13 10:21:32 +06:00