- Jun 09, 2017
-
faiface authored
-
Michal Štrba authored
revised performance tuning pull request
-
Seebs authored
The computation including a call to Stride() can't be optimized away safely because the compiler can't tell that Stride() is effectively constant, but we know it won't change so we can make a slice pointing at that part of the array. CPU time for updateData goes from 26.35% to 18.65% in my test case.
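A minimal sketch of the idea in this commit, not the actual code: the `strider` interface, `layout` type, and both fill functions are hypothetical names made up for illustration. It contrasts recomputing an index through `Stride()` on every access with reading the stride once and slicing out each vertex's part of the array.

```go
package main

import "fmt"

// strider is a hypothetical stand-in for a vertex layout whose Stride()
// never changes after construction, which the compiler cannot prove.
type strider interface {
	Stride() int
}

type layout struct{ stride int }

func (l layout) Stride() int { return l.stride }

// fillDirect recomputes i*s.Stride()+k for every element it writes.
func fillDirect(data []float32, s strider, n int, v float32) {
	for i := 0; i < n; i++ {
		for k := 0; k < s.Stride(); k++ {
			data[i*s.Stride()+k] = v
		}
	}
}

// fillSliced reads Stride() once and takes one sub-slice per vertex,
// so the inner loop indexes that sub-slice directly.
func fillSliced(data []float32, s strider, n int, v float32) {
	stride := s.Stride()
	for i := 0; i < n; i++ {
		vertex := data[i*stride : (i+1)*stride]
		for k := range vertex {
			vertex[k] = v
		}
	}
}

func main() {
	a := make([]float32, 12)
	b := make([]float32, 12)
	fillDirect(a, layout{4}, 3, 1.5)
	fillSliced(b, layout{4}, 3, 1.5)
	fmt.Println(a, b)
}
```

Both functions write the same values; any speed difference would have to be measured with a profiler or benchmark on the real workload.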
-
Seebs authored
A slice of points means copying every point into the slice, then copying every point's data from the slice to TrianglesData. An array of indices lets the compiler make better choices.
-
Seebs authored
For polyline, don't compute each normal twice; when we're going through a line, the "next" normal for segment N is always the "previous" normal for segment N+1, and we can compute fewer of them.
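The reuse described here can be sketched as follows. This is an illustration under assumptions, not the library's polyline code: `vec`, `joint`, `normal`, and `joints` are hypothetical names, and consecutive points are assumed to be distinct so that no segment has zero length.

```go
package main

import (
	"fmt"
	"math"
)

type vec struct{ X, Y float64 }

// normalCalls counts how often a normal is computed, for illustration only.
var normalCalls int

// normal returns the unit normal of segment a->b, rotated 90 degrees
// counterclockwise from the segment direction.
func normal(a, b vec) vec {
	normalCalls++
	dx, dy := b.X-a.X, b.Y-a.Y
	l := math.Hypot(dx, dy)
	return vec{-dy / l, dx / l}
}

// joint holds the normals of the segments entering and leaving a point.
type joint struct{ prev, next vec }

// joints returns one joint per interior point of an open polyline.
// The "next" normal at point i is carried over as the "previous" normal
// at point i+1, so each segment's normal is computed exactly once.
func joints(pts []vec) []joint {
	if len(pts) < 3 {
		return nil
	}
	out := make([]joint, 0, len(pts)-2)
	prev := normal(pts[0], pts[1])
	for i := 1; i < len(pts)-1; i++ {
		next := normal(pts[i], pts[i+1])
		out = append(out, joint{prev, next})
		prev = next
	}
	return out
}

func main() {
	pts := []vec{{0, 0}, {1, 0}, {1, 1}, {2, 1}}
	js := joints(pts)
	fmt.Println(len(js), normalCalls)
}
```

With this shape, a polyline of N segments needs N normal computations rather than two per interior point.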
-
Seebs authored
For internal operations (anything using getAndClearPoints), there's a pretty good chance that the operation will repeatedly invoke something like fillPolygon(), meaning that it needs to push "a few" points and then invoke something that uses those points. So, we add a slice for containing spare slices of points, and on the way out of each such function, shove the current imd.points (as used inside that function) onto a stack, and set imd.points to [0:0] of the thing it was called with. Performance goes from 11-13fps to 17-18fps on my test case.
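One way to picture the spare-slice stack is the sketch below. Only `getAndClearPoints` and the `points` field are named in the commit message; the `drawer` type, the `spare` field, `restorePoints`, `outline`, and `fill` are hypothetical, and the real implementation may differ in detail.

```go
package main

import "fmt"

type point struct{ X, Y float64 }

// drawer is a hypothetical stand-in for the drawing type.
type drawer struct {
	points []point   // points pushed since the last drawing operation
	spare  [][]point // stack of emptied slices whose capacity can be reused
}

func (d *drawer) push(p point) { d.points = append(d.points, p) }

// getAndClearPoints hands the pending points to the caller and leaves the
// drawer with an empty slice, taken from the spare stack when one exists.
func (d *drawer) getAndClearPoints() []point {
	pts := d.points
	if n := len(d.spare); n > 0 {
		d.points = d.spare[n-1]
		d.spare = d.spare[:n-1]
	} else {
		d.points = nil
	}
	return pts
}

// restorePoints runs on the way out of an operation: the slice used inside
// the operation goes onto the spare stack, and the drawer continues with
// the caller's slice truncated to length zero.
func (d *drawer) restorePoints(pts []point) {
	if cap(d.points) > 0 {
		d.spare = append(d.spare, d.points[:0])
	}
	d.points = pts[:0]
}

// fill consumes the pending points and reports how many it used.
func (d *drawer) fill() int {
	pts := d.getAndClearPoints()
	defer d.restorePoints(pts)
	return len(pts)
}

// outline pushes a few points per segment and invokes fill repeatedly,
// which is the usage pattern the stack is meant to make cheap.
func (d *drawer) outline() int {
	pts := d.getAndClearPoints()
	defer d.restorePoints(pts)
	used := 0
	for i := 0; i+1 < len(pts); i++ {
		d.push(pts[i])
		d.push(pts[i+1])
		used += d.fill()
	}
	return used
}

func main() {
	d := &drawer{}
	d.push(point{0, 0})
	d.push(point{1, 0})
	d.push(point{1, 1})
	fmt.Println(d.outline(), len(d.spare))
}
```

In this sketch the small slice used for the per-segment points is allocated once and then reused by every later `fill` call, and it is parked on the spare stack when `outline` returns.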
-
Seebs authored
It turns out that affine matrices are much simpler than the 3x3 matrices they imply, and we can use this to dramatically streamline some code. For a test program, the gain came from the applyMatrixAndMask calls in imdraw, which were calling matrix.Project() many times: simplifying matrix.Project alone gave a nearly 50% frame rate boost. Also modify pixelgl's SetMatrix to copy the six values of a 3x2 Affine into the corresponding locations of a 3x3 matrix.
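To illustrate why the affine form is cheaper, here is a hedged sketch. The `affine` type, its column-by-column storage order, and the `project` and `to3x3` methods are assumptions chosen for the example, not necessarily the library's layout. Because the bottom row of the implied 3x3 matrix is always (0, 0, 1), projecting a point needs only four multiplications and four additions.

```go
package main

import "fmt"

type vec struct{ X, Y float64 }

// affine stores the top two rows of a 3x3 matrix, column by column
// (an assumed layout for this sketch):
//
//	| m[0] m[2] m[4] |
//	| m[1] m[3] m[5] |
//	|  0    0    1   |
type affine [6]float64

// project applies the transform to a point. The implicit bottom row
// means no third coordinate and no division are needed.
func (m affine) project(u vec) vec {
	return vec{
		m[0]*u.X + m[2]*u.Y + m[4],
		m[1]*u.X + m[3]*u.Y + m[5],
	}
}

// to3x3 copies the six values into a column-major 3x3 matrix and fills
// in the constant bottom row, as a shader uniform might expect.
func (m affine) to3x3() [9]float32 {
	return [9]float32{
		float32(m[0]), float32(m[1]), 0,
		float32(m[2]), float32(m[3]), 0,
		float32(m[4]), float32(m[5]), 1,
	}
}

func main() {
	// Scale by (2, 3), then translate by (10, 20).
	m := affine{2, 0, 0, 3, 10, 20}
	fmt.Println(m.project(vec{1, 1}))
	fmt.Println(m.to3x3())
}
```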
-
- Jun 08, 2017
-
Seebs authored
Removing the call to Alpha(1) and replacing it with an inline definition produces measurable improvements. Replacing each instance of ZV with Vec{} further improves things. We keep an inline RGBA because there are circumstances (mostly when using pictures) where we don't want to have to set colors to get default behavior. For a fairly triangle-heavy thing, this reduces time spent in SetLen from something over 10% of execution time to around 2.5% of execution time.
-
- May 30, 2017
- May 28, 2017
- May 27, 2017
- May 26, 2017
- May 25, 2017
-
faiface authored
-
Michal Štrba authored
-
faiface authored
-
- May 24, 2017
- May 23, 2017
- May 21, 2017