@@ -206,17 +206,17 @@ puts(utf8.c_str());
 
 ...the `convert.to_bytes()` only passes a `char16_t *` parameter to the method. **No string length
 parameter was provided**. In other words, my "hunch" seemed to be correct; the UTF-16-to-UTF-8
-conversion code seemed to expect a _NUL-terminated string_. I checked and
+conversion code seemed to expect a _`NUL`-terminated string_. I checked and
 [cppreference.com](https://en.cppreference.com/w/cpp/locale/wstring_convert/to_bytes.html) confirmed
-that the method was indeed expecting a NUL-terminated string.
+that the method was indeed expecting a `NUL`-terminated string.
 
 And that's precisely what it no longer would be getting. Because I was now shrinking the UTF-16
 string to the exact amount of `uint16_t` elements, the string would no longer contain any "extra",
 default-initialized (zero) elements at the end => attempting to print it would trigger the
-undesirable _undefined behaviour_ we were seeing above. Sometimes (FreeBSD), printing
-garbage characters. Sometimes (OpenBSD), triggering an error in the C++ standard library. And
-sometimes (and this is perhaps the _worst one of them all_), **working just like I had intended it
-to work**, printing only the "this is a an ASCII string" content.
+undesirable _undefined behaviour_ we were seeing above. Sometimes (FreeBSD), printing garbage
+characters. Sometimes (OpenBSD), triggering an error in the C++ standard library. And sometimes (and
+this is perhaps the _most deceitful one of them all_), **working just like I had intended it to
+work**, printing only the "this is a an ASCII string" content - despite my coding error!
 
 Where is this log, you say? It's [actually
 there](https://gitlab.perlang.org/perlang/perlang/-/jobs/3191), but I haven't shown it to you yet.
@@ -264,16 +264,16 @@ index 98785a2..6536505 100644
  }
 ```
 
-...and we are back on the safe (NUL-terminating) side. The second hunk above is the important one;
-it makes sure to reserve exactly one character extra for the NUL terminator. I don't plan to let
-UTF-16 strings in Perlang be NUL terminated in the long run, but for now, it'll be good enough.
+...and we are back on the safe (`NUL`-terminating) side. The second hunk above is the important one;
+it makes sure to reserve exactly one character extra for the `NUL` terminator. I don't plan to let
+UTF-16 strings in Perlang be `NUL` terminated in the long run, but for now, it'll be good enough.
 
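The two ways out of the `NUL`-termination trap can be sketched roughly as follows. This is a minimal illustration, not the actual Perlang code: the helper names `utf16_to_utf8` and `with_nul_terminator` and the `Utf16Converter` alias are hypothetical. It assumes a standard library that still ships `std::wstring_convert` (deprecated since C++17, so expect deprecation warnings).

```cpp
#include <algorithm>
#include <codecvt>
#include <cstddef>
#include <locale>
#include <string>
#include <vector>

// Hypothetical alias for the converter type discussed in the post.
using Utf16Converter = std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t>;

// Hypothetical helper: converts exactly `length` UTF-16 code units to UTF-8.
// The pointer-pair overload of to_bytes() stops at `data + length`, so the
// buffer does not need a trailing NUL element.
std::string utf16_to_utf8(const char16_t* data, std::size_t length)
{
    Utf16Converter convert;
    return convert.to_bytes(data, data + length);
}

// Hypothetical helper mirroring the "reserve one extra element" approach:
// the vector is value-initialized, so the final element stays zero and acts
// as the NUL terminator that to_bytes(const char16_t*) looks for.
std::vector<char16_t> with_nul_terminator(const char16_t* data, std::size_t length)
{
    std::vector<char16_t> buffer(length + 1);
    std::copy(data, data + length, buffer.begin());
    return buffer;
}
```

With the first helper, a buffer shrunk to its exact length can be converted safely because the length travels with the pointer. The second helper corresponds to the interim fix described above: the single-pointer `to_bytes()` overload is only safe once the extra zero element is in place.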
 ## The moral of the story
 
-- **You need automated testing**. That I even have to write this is a bit tragic, but there are
+- **You need automated testing**. That I even have to write this is a bit amazing, but there are
   still people arguing that automated testing is a waste of time. Suffice to say, if I didn't have
-  automated tests in this case, I could very well have merged in broken code this time, without even
-  being aware of it. At the very least, I would have realized the breakage when starting to port the
+  automated tests in this case, I could very well have merged in broken code, without even being
+  aware of it. At the very least, I would have realized the breakage when starting to port the
   Perlang compiler to FreeBSD or OpenBSD, which brings us to the next point...
 
 - **Testing on multiple platforms is good**. I know, this isn't always easily doable. I have the
@@ -285,9 +285,9 @@ UTF-16 strings in Perlang be NUL terminated in the long run, but for now, it'll
   support.
 
 - **You need Valgrind**. Well, maybe you don't. This largely depends on what kind of project you're
-  working, but _if_ you are working with a language with manual memory management (like C, C++, Zig,
-  Odin), chances are that [Valgrind](https://en.wikipedia.org/wiki/Valgrind) will be useful to
-  you. It helps you find common errors like the one's we saw above (reading outside an allocated
+  working on, but _if_ you are working with a language with manual memory management (like C, C++,
+  Zig, Odin), chances are that [Valgrind](https://en.wikipedia.org/wiki/Valgrind) will be useful
+  to you. It helps you find common errors like the one's we saw above (reading outside an allocated
   buffer), use-after-free, double-free and memory leaks. It can also help you spot potential bad
   patterns like mixing `malloc` and `delete` (or `new` and `free`) in C++. I think there's a bunch
   of other checks it can help you with too, but the `memcheck` stuff is the one I have experience