Ridley5 opened this issue on Jul 26, 2010 · 1724 posts
odf posted Tue, 10 August 2010 at 8:32 PM
Quote - > Quote - > Quote - This is a classic way to calculate normals, the way Poser4 does; there are other ways to do it, with different visual results.
The code is in C (I use Begin for { and End for }).
I don't see any speed problem with this. The only part worth optimizing is NormalizeVector, which requires a division and a square root, both of which are slow. It can be done in assembler with SSE, and then it will be very fast.
Yep, that's almost precisely how I'm doing it, just vectorized via Numeric. Except that you're using a trick to save a few subtractions when computing the face normals that I had forgotten about.
Look:
    def compute_normals(self):
        verts = self.verts
        polys = self.geom.Polygons()
        sets = self.geom.Sets()

        normals = num.zeros([self.nr_verts, 3], "double")

        for i, p in enumerate(polys):
            nv = p.NumVertices()
            start = p.Start()

            indices = sets[start : start + nv]

            points = num.take(verts, indices)
            s = points - num.take(verts, rotate_rows(indices, 1))
            t = rotate_rows(s, -1)
            n = rotate_rows(num.sum(s * rotate_columns(t, -1)
                                    - t * rotate_columns(s, -1)), -1)
            n = n / num.sqrt(num.dot(n, n))

            for v in indices:
                normals[v] += n

        self.normals = normalize_rows(normals)
:laugh:
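For readers who want to follow the algorithm without Numeric, here is a minimal plain-Python sketch of the same idea: sum the cross products of consecutive edge vectors to get each face normal, normalize it, add it to every vertex of the face, and normalize the per-vertex sums at the end. It is not the code from the post. The function names, the list-of-tuples mesh layout and the winding convention are assumptions made for illustration.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalized(v):
    """Return v scaled to unit length; a zero vector is returned unchanged."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(c / length for c in v)


def compute_vertex_normals(verts, polys):
    """verts: list of (x, y, z); polys: list of vertex-index lists (assumed layout)."""
    normals = [[0.0, 0.0, 0.0] for _ in verts]
    for indices in polys:
        points = [verts[i] for i in indices]
        k = len(points)
        # Edge vectors from each corner to the next one, wrapping around.
        edges = [tuple(points[(j + 1) % k][c] - points[j][c] for c in range(3))
                 for j in range(k)]
        # Face normal: sum of cross products of consecutive edges.
        n = [0.0, 0.0, 0.0]
        for j in range(k):
            cx = cross(edges[j], edges[(j + 1) % k])
            for c in range(3):
                n[c] += cx[c]
        n = normalized(n)
        # Accumulate the unit face normal on every vertex of the face.
        for v in indices:
            for c in range(3):
                normals[v][c] += n[c]
    return [normalized(n) for n in normals]


# A unit square in the xy-plane, wound counter-clockwise seen from +z.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
result = compute_vertex_normals(square, [[0, 1, 2, 3]])
```

With this winding every vertex normal of the square points along +z. Reversing the index order flips the sign.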
Slick. I figured the numeric library would make this fast and easy.
You may be reaching a point where the C calls you're making are a tiny fraction of the time and it takes a while just to find them.
By that I mean, for example, on each iteration of the loop, you're doing:
lookup "num" in globals 4 times
lookup (in num) "take" 2 times
lookup (in num) "sum"
lookup (in num) "dot"
lookup (in num) "sqrt"
lookup rotate_rows in globals 3 times
lookup rotate_columns in globals 2 times
That's 14 object lookups on each iteration. All of them are constants - they don't change.
So you can probably get another speedup by factoring those out before the loop into locals.
    l_take = num.take
    l_sum = num.sum
    l_rotate_rows = rotate_rows
etc. and then just use them instead of doing all those lookups thousands and thousands of times.
Nice work.
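Whether hoisting pays off is easiest to settle by measuring. Below is a small sketch that times the same loop with and without local aliases, using only the standard library. The two functions and the test data are made up for illustration and stand in for the real inner loop; the numbers will differ between machines and Python versions, so no particular speedup is claimed.

```python
import math
import timeit


def lengths_global(vectors):
    """Looks up 'math', then 'sqrt', and 'out.append' on every iteration."""
    out = []
    for x, y, z in vectors:
        out.append(math.sqrt(x * x + y * y + z * z))
    return out


def lengths_local(vectors):
    """Same computation with the lookups hoisted into locals before the loop."""
    l_sqrt = math.sqrt
    out = []
    l_append = out.append
    for x, y, z in vectors:
        l_append(l_sqrt(x * x + y * y + z * z))
    return out


if __name__ == "__main__":
    data = [(float(i), 1.0, 2.0) for i in range(10000)]
    for fn in (lengths_global, lengths_local):
        seconds = timeit.timeit(lambda: fn(data), number=20)
        print(fn.__name__, round(seconds, 4))
```

When the body of the loop is dominated by heavier calls, as in the vectorized normals code, the relative gain from hoisting would be expected to shrink accordingly.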
Thanks! I'll try your suggestion, but honestly, I'm not overly optimistic about it. In all the years I've been programming in Python, I don't recall ever having gained a significant speed-up from that kind of optimization, not even in inner loops that were executed tens of thousands of times like here. Your mileage may vary, of course.
But I've noticed that I'm doing
    points = num.take(verts, indices)
    s = points - num.take(verts, rotate_rows(indices, 1))
which should be equivalent to
    points = num.take(verts, indices)
    s = points - rotate_rows(points, 1)
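The equivalence holds because picking rows with a rotated index list gives the same result as picking the rows first and then rotating them. A quick plain-Python check, where `take` and `rotate` are hypothetical stand-ins for `num.take` and `rotate_rows` (the rotation direction is an assumption and does not affect the argument):

```python
def take(rows, indices):
    """Pick rows by index, like a row-wise take."""
    return [rows[i] for i in indices]


def rotate(seq, k):
    """Rotate a list by k positions (direction chosen arbitrarily here)."""
    if not seq:
        return seq
    k %= len(seq)
    return seq[k:] + seq[:k]


verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
indices = [2, 0, 3]

via_indices = take(verts, rotate(indices, 1))   # rotate the indices, then look up
via_points = rotate(take(verts, indices), 1)    # look up once, then rotate the rows

same = via_indices == via_points
```

The second form needs only one lookup pass per polygon, which is the saving described above.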
-- I'm not mad at you, just Westphalian.