I fought with this problem off and on for a couple of days. My company has 1000 blank Arduino Ethernets coming in, and I didn't enjoy the thought of programming them with ArduinoAsISP, since I estimated I would be looking at about 17 hours to program them all.
I finally came to the same conclusion: it was a fuse setting related to clock speed. My working theory was that blank chips run at a different clock speed by default than the CLOCKSPEED_FUSES value in the optiboot.h file assumes, which caused the fuse-setting stage to fail and prevented the rest of the programming process from working. The optiboot.h file was set to #define CLOCKSPEED_FUSES SPI_CLOCK_DIV128. There is a comment in there about fiddling with this speed, but its mention of the internal crystal threw me off because our boards have external crystals.
At any rate, in desperation I changed it to #define CLOCKSPEED_FUSES SPI_CLOCK_DIV64, and now the fuses get programmed correctly, allowing the standalone programmer to work as expected. I would love for someone to explain the clock math that makes it work. The default low fuse setting on these chips is 0x62 (01100010).
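For anyone who wants to check the numbers, here is a minimal sketch of what 0x62 decodes to and what SPI clock each divider produces. It assumes the target is an ATmega328P (low fuse layout CKDIV8, CKOUT, SUT1:0, CKSEL3:0, where a bit reading 0 means "programmed") and that the programmer is a 16 MHz AVR Arduino. The function names are made up for illustration and are not part of the programmer code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields of the ATmega328P low fuse byte (assumed target).
   A fuse bit counts as "programmed" when it reads 0. */
typedef struct {
    bool    ckdiv8;  /* true if CKDIV8 is programmed (clock divided by 8) */
    bool    ckout;   /* true if CKOUT is programmed (clock output on a pin) */
    uint8_t sut;     /* SUT1:0, start-up time selection */
    uint8_t cksel;   /* CKSEL3:0, clock source selection */
} low_fuse_t;

/* Split a raw low fuse byte into its fields. */
low_fuse_t decode_low_fuse(uint8_t lfuse)
{
    low_fuse_t f;
    f.ckdiv8 = (lfuse & 0x80) == 0;
    f.ckout  = (lfuse & 0x40) == 0;
    f.sut    = (lfuse >> 4) & 0x03;
    f.cksel  = lfuse & 0x0F;
    return f;
}

/* System clock in Hz when CKSEL selects the calibrated internal 8 MHz RC
   oscillator (CKSEL = 0010). Returns 0 for any other clock source, which
   this sketch does not model. */
uint32_t internal_rc_clock_hz(low_fuse_t f)
{
    if (f.cksel != 0x2) {
        return 0;
    }
    return f.ckdiv8 ? 1000000UL : 8000000UL;
}

/* SPI clock (SCK) produced by a programmer running at f_cpu_hz
   with the given SPI divider. */
uint32_t sck_hz(uint32_t f_cpu_hz, uint32_t divider)
{
    assert(divider != 0);
    return f_cpu_hz / divider;
}
```

Under those assumptions, decode_low_fuse(0x62) gives CKDIV8 programmed and CKSEL = 0010, so a blank chip runs from its internal oscillator at 8 MHz / 8 = 1 MHz regardless of the external crystal on the board. On a 16 MHz programmer, sck_hz(16000000UL, 128) is 125 kHz and sck_hz(16000000UL, 64) is 250 kHz.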
I have to assume that the reason the standalone programmer worked on previously programmed parts is that the low fuse was already set to match the actual clock.
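A rough way to compare the two cases is the serial programming rule from the ATmega328P datasheet: SCK high and low times must each exceed 2 CPU clock cycles when the target runs below 12 MHz, and 3 cycles at 12 MHz or above. The sketch below turns that into a frequency bound. The 16 MHz figure for an already-programmed board (external crystal, no divide-by-8) and the helper names are my assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Upper bound on the ISP SCK frequency for a target running at target_hz.
   SCK high and low must each last more than 2 CPU cycles below 12 MHz
   (period > 4 cycles) and more than 3 cycles at 12 MHz and above
   (period > 6 cycles). SCK must stay strictly below the returned value. */
uint32_t isp_sck_bound_hz(uint32_t target_hz)
{
    assert(target_hz > 0);
    return (target_hz < 12000000UL) ? target_hz / 4 : target_hz / 6;
}

/* True if the given SCK frequency is strictly inside the bound. */
bool sck_within_bound(uint32_t sck_hz, uint32_t target_hz)
{
    return sck_hz < isp_sck_bound_hz(target_hz);
}
```

By this rule a target running at 16 MHz tolerates SCK up to roughly 2.67 MHz, while a blank part at 1 MHz needs SCK below 250 kHz, so blank chips leave far less margin. Note that 125 kHz is nominally inside both bounds and 250 kHz sits right on the edge for a 1 MHz part, so this arithmetic on its own does not fully explain why the DIV64 setting worked for me where DIV128 did not.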
I also changed the image to the latest version of optiboot (5.0) and all is well. Next up is to use this code to load a blink sketch as well, to give a quick visual indication that the micro is alive.
Now I should be able to program 1000 Arduino Ethernets in 1.5 hours or so.
Hope this fix saves others some frustration.