
I believe that on several modern compilers, memcpy and memset are practically, if not literally, treated as intrinsics. That is, the compiler has been granted semantic understanding of those specific functions and can generate very well optimized assembly based on that knowledge. I haven't heard of the same being done for memmove.


They have got so good at recognising hand-written memcpy implementations and replacing them with builtins that it is almost becoming impossible to actually write a libc any more, as the body of memcpy gets replaced with a call to memcpy [1].

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
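For illustration, this is the kind of hand-written byte loop being discussed. It is only a sketch: `copy_bytes` is a hypothetical name, and whether a given compiler turns the loop into a library call depends on its version and flags. In GCC, options such as `-ffreestanding`, `-fno-builtin`, or `-fno-tree-loop-distribute-patterns` are the usual knobs people reach for when implementing these functions themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hand-written copy routine. An optimizer with loop-idiom
 * recognition may replace the loop with a call to the library memcpy. */
static void *copy_bytes(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;   /* plain byte-by-byte copy */

    return dst;
}
```

If a function like this were itself named memcpy, such a transformation would make it call itself, which is the problem the linked bug report is about.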


What compilers are you thinking about? I've never seen GCC inlining and specializing a call to memcpy, it just calls the generic (but optimized) implementation in the standard library.



Yeah, you were right. I've checked, and it seems the GCC version I'm using inlines memcpy calls if it can statically determine the size argument and the size is less than or equal to 64 bytes.
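A small way to check this yourself, as a sketch: the helpers below (hypothetical names `load_u32` and `store_u32`) call memcpy with a compile-time-constant size of 4 bytes. Compiling with something like `gcc -O2 -S` and reading the assembly shows whether a call to memcpy remains; the exact size threshold is the commenter's observation and may differ between versions and targets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned byte buffer.
 * The size is a compile-time constant, so the compiler is free to
 * emit a single load instead of a library call. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to a possibly unaligned byte buffer. */
static void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```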


I take it that you are a long-time C programmer who hasn't realized that the language is no longer the one you grew up loving. That reinterpreting a float as an int through a pointer cast to do some bit twiddling is now undefined behavior of the sort that technically entitles the compiler to have its way with your root filesystem. Although usually it will just let you off with a warning and a gaping hole where your null checking code used to be.

Believe it or not, memcpy() is now the only portable and properly defined way to cast one type to another in C. You are expected to use it with the explicit intent that no bytes actually be copied, simply as a (completely voluntary) offering to the type gods. If they accept your offering, the call will just disappear from your code, and never be made. If you fail to make this offering, they feel entitled to excise an equal quantity of other code to punish you.
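For readers who have not seen the idiom, here is a minimal sketch of it. The function names `float_bits` and `bits_float` are made up for this example, and the example values assume that float is a 32-bit IEEE 754 type, which the C standard does not guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Return the object representation of f as a 32-bit integer,
 * without casting between pointer types. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* The inverse: build a float from a 32-bit pattern. */
static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

On an IEEE 754 platform, `float_bits(1.0f)` yields `0x3F800000`. Whether the memcpy call disappears from the generated code is up to the optimizer, so inspect the assembly if it matters.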

Sure, you the programmer happen to know that bits are bits, and that on your system they are already in the register you want to use, but the compiler stopped playing that game years ago. If you wanted to work close to the metal, you should have chosen a language more appropriate for the task. There's a great discussion of the issues in this epic comp.arch thread: http://compgroups.net/comp.arch/if-it-were-easy/2993157

Search for the first occurrence of 'memcpy', where you'll find a polite but beleaguered Terje Mathisen asking for the best way to portably cast a float to an integer in C. Then keep searching forward for further occurrences of memcpy as the situation becomes surreal, with GCC maintainer Mike Stump eventually clearing things up:

  >>> So what is the blessed method?
  >>
  >> Just memcpy it, simple sweet, fast, standard.
  >
  >So what you are saying is that memcpy() isn't just magic but high magic:

  No.  What I think I'm saying is that it is standard.  See the quoted
  text above.  The word magic [ checking ] doesn't not occur in c99-tc3.
  What is defined in that standard is memcpy, it is as standard as if,
  which is also defined.

  >It looks like a function call but can be whatever the compiler wants it 
  >to be, as long as the results behave as if the data was actually copied.
  >:-)

  No, you fail to grasp the totality of the standard.   The implementation 
  is free to do _anything_ it wants.  The only constraint is that the user 
  can't figure that it deviated from the required semantics by using a standards
  defined mechanism for figuring it out.  We call this the as-if rule, and its
  power is awesome; we could destroy the universe, and repiece it back together
  one subatomic particle at a time over a billion years, and still be compliant,
  if we wanted.

So there you have it: if you want to reuse the exact contents of a register as a different type in a way that you know will work, the correct approach is to use memcpy() to copy the data from one variable to another, and then hope without confirmation that GCC has optimized out the call to make it equivalent to a simple cast.

If you want a better understanding of the GCC mindset, the rest of Mike's posts are excellent: cutting, extremely technically accurate, and (in my opinion) completely missing the point of why some people are unhappy with the direction that C is evolving.


C11 explicitly blessed unions for type-punning as well as memcpy (actually, C99 TC2 did). That said, people should just use memcpy.
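For comparison, a sketch of the union form of the same float-to-integer pun. The type `float_u32` and the function `float_bits_union` are hypothetical names, and the example again assumes a 32-bit IEEE 754 float. The access goes through the union object itself, not through pointers to its members.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Both members share the same storage. */
union float_u32 {
    float    f;
    uint32_t u;
};

/* Write one member, then read the other through the same union object. */
static uint32_t float_bits_union(float f)
{
    union float_u32 pun;
    pun.f = f;
    return pun.u;
}
```

This is valid C but not valid C++, where reading an inactive union member is undefined, which is one practical reason to prefer the memcpy form in code shared between the two languages.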


Yes, it's now in the standard (someone in the thread I linked pointed to the TC3 draft, §6.5.2.3, footnote 82), but apparently there are issues with the way GCC implements it in its default dialect that make it unsafe. More Mike Stump:

  >>> So what about using a union?
  >>
  >> Most people screw it up, the rules for it working are slightly odd.
  >> Those rules are not standard[1], rather they are gcc specific, so in very
  >
  >Ouch, so what you are saying is that this is one of those (according to 
  >Nick M) barely defined areas of the language?

  What I am saying is that it is defined to not work in the language
  standard gcc implements by default.  There is no barely, there is no
  defined.  As an extension to the language standard, gcc implements
  (defines) a few things so that users can make some non-standard things
  always work.  I say slightly odd, as the rules are just a tad harder
  than trivial.

I prefer the union approach to memcpy() because I can more easily reason about its behaviour, but veiled warnings from GCC maintainers scare me away from it. But perhaps I misunderstand the warning.

For that matter, I prefer simple casts to unions, and while I agree the spec makes it undefined, I don't yet see why all common implementations couldn't simply make it work in all reasonable cases. I'm currently trying to get used to using memcpy() for type annotation, but it still feels unnatural to write a function I don't want executed (yes, I have trouble with setters/getters too).

What I'd like is to have a way to specify to the compiler that I want it to compile the code I give it as written as best as it can, rather than optimizing it out as undefined. As it is, I often resort to inline assembly if I actually want an operation to happen. It seems like there should be an intermediate approach, a hypothetical '#pragma "dwim"' that could avoid this.



