
> Nesticle/Nesticle95 did not require patches to NES games

NES ROMs were distributed pre-patched to work on Nesticle. I was there for it. This was not done by Bloodlust, but by people wanting to play the games.

Some emulators to this day ship with databases of checksums to detect these images.
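To make the checksum idea concrete, here is a minimal sketch of how such a lookup could work. It is not taken from any particular emulator: the names `crc32_of`, `KnownImage` and `lookup_image` are hypothetical, and the choice of CRC-32 is an assumption (a real database might key on a different hash).

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Standard CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit.
// Slow but short; a table-driven version would be used in practice.
inline std::uint32_t crc32_of(const std::vector<std::uint8_t>& data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::uint8_t byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            // If the low bit is set, shift and XOR in the polynomial.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical database entry: a note on what is known about the image.
struct KnownImage {
    std::string note;
};

// Returns the matching entry, or nullptr if the image is not in the database.
inline const KnownImage* lookup_image(
    const std::unordered_map<std::uint32_t, KnownImage>& db,
    const std::vector<std::uint8_t>& rom) {
    auto it = db.find(crc32_of(rom));
    return it == db.end() ? nullptr : &it->second;
}
```

On load, an emulator following this approach would hash the image, call something like `lookup_image`, and apply whatever correction the matching entry calls for.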

> IMO, a SNES emulator requiring 3GHz means brain-dead implementation

I look forward to your SNES emulator with no known bugs and much lower system requirements. And if you aren't making one, why are you so sure of what is required? None of the other SNES emulator authors who have looked at my code feel there is a technical flaw in its implementation.

You need to understand the exponential function to understand bsnes' system requirements. When I first started out, I was able to run ~99% of games almost as quickly as Snes9X (the latter used ASM cores for the CPU, SA1 and SFX). The more accurate you get, the fewer games each improvement fixes. Getting everything running at the same time is the hard part: it means you can't take shortcuts anywhere.

If I chose to remove behaviors that would break at best a dozen games, my emulator could run four times as fast. In fact, I've made a build that does just that. Here it is getting 80fps on an in-order Atom CPU: http://imageshack.us/photo/my-images/405/zelda3v.png/ -- this build pushes 300-400fps on my 3GHz E8400.



In my opinion, it is slow. And I bet you, or others, have room to increase performance by 2-4x in the next few years. Even without SIMD intrinsics (autovectorization-friendly code is OK, though), but with proper data structure usage, better L1 code cache usage, a higher data cache hit ratio, less function/pointer indirection, less data cache pollution, etc.
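As an illustration of one item on that list (data layout and cache hit ratio), here is a small sketch. It has nothing to do with bsnes's actual code: `SpriteAoS`, `SpritesSoA` and the 62 bytes of "cold" padding are hypothetical, and the benefit assumes a typical 64-byte cache line.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Array-of-structs: each hot 'x' sits next to 62 bytes of rarely used data,
// so a loop over 'x' touches one 64-byte element per value read.
struct SpriteAoS {
    std::int16_t x;
    std::uint8_t cold[62];  // hypothetical rarely accessed fields
};

inline std::int64_t sum_x_aos(const std::vector<SpriteAoS>& sprites) {
    std::int64_t total = 0;
    for (const SpriteAoS& s : sprites) total += s.x;
    return total;
}

// Struct-of-arrays: the hot values are stored contiguously, so the same loop
// reads 32 of them per 64-byte line and is easier to autovectorize.
struct SpritesSoA {
    std::vector<std::int16_t> x;
    std::vector<std::uint8_t> cold;  // cold data kept out of the hot loop
};

inline std::int64_t sum_x_soa(const SpritesSoA& sprites) {
    std::int64_t total = 0;
    for (std::int16_t v : sprites.x) total += v;
    return total;
}
```

Both functions return the same sum; only the memory traffic differs. Whether a change like this pays off in an emulator depends on where the time actually goes, which only profiling can tell.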


I wrote an article on this here: http://byuu.org/articles/optimization

Executive summary: yes, I could probably get it twice as fast with identical accuracy (in the absolute best case -- 50% faster is more realistic); but I would sacrifice the portability, maintainability and readability of the code. This is a big problem because I have very limited time and resources as a single person. It's because the code is so straightforward that I'm able to rapidly fix bugs when they are found.


First of all, your project is OK as is; I respect your judgment, of course.

My point about the 3GHz requirement was that it comes mainly from the implementation, not from the accuracy requirement. I bet 2-4x faster is possible without reducing portability. If by "maintainability and readability" you mean keeping it in C++ (L1 code cache impact, indirections, data structure penalties, etc.), I agree: that will be a fixed cost in time and space.



