Communications of the ACM

Blogroll


Generating arrays at compile-time in C++ with lambdas
From Daniel Lemire's Blog

Suppose that you want to check whether a character in C++ belongs to a fixed set, such as '\0', '\x09', '\x0a', '\x0d', ' ', '#', '/', ':', '<', '>', '?', '@', '...
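
As a rough illustration of the idea in the title, the sketch below builds a 256-entry lookup table at compile time with an immediately invoked lambda (C++17 or later). It is not taken from the post: the names `special_table` and `is_special` are hypothetical, and the set contains only the characters visible in the truncated excerpt.

```cpp
#include <array>
#include <cassert>

// Lookup table built at compile time by an immediately invoked lambda.
// Assumption: only the characters visible in the truncated excerpt are listed.
constexpr std::array<bool, 256> special_table = []() {
    std::array<bool, 256> table{};  // value-initialized: every entry is false
    const char members[] = {'\0', '\x09', '\x0a', '\x0d', ' ', '#',
                            '/',  ':',    '<',    '>',    '?', '@'};
    for (char c : members) {
        table[static_cast<unsigned char>(c)] = true;
    }
    return table;
}();

// Membership test: a single array load, no branches on the set members.
constexpr bool is_special(char c) {
    return special_table[static_cast<unsigned char>(c)];
}

static_assert(is_special('#'), "'#' is in the set");
static_assert(!is_special('a'), "'a' is not in the set");
```

Because the table is a `constexpr` object, the checks in the `static_assert` lines are evaluated by the compiler, and a runtime call such as `is_special(c)` reduces to one indexed read.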

Appending to an std::string character-by-character: how does the capacity grow?
From Daniel Lemire's Blog

In C++, suppose that you append to a string one character at a time: while(my_string.size() <= 10'000'000) { my_string += "a"; } In theory, it might be possible...
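
One way to observe the behaviour in question is to record the capacity each time it changes while running a loop like the one quoted above. The helper below is a minimal sketch, not code from the post; `capacity_steps` is a hypothetical name, and the recorded values depend on the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends one character at a time and records the capacity whenever it
// changes. The first entry is the capacity of the empty string.
std::vector<std::size_t> capacity_steps(std::size_t target_size) {
    std::string my_string;
    std::vector<std::size_t> steps{my_string.capacity()};
    while (my_string.size() <= target_size) {
        my_string += "a";
        if (my_string.capacity() != steps.back()) {
            steps.push_back(my_string.capacity());
        }
    }
    return steps;
}
```

Dividing each recorded capacity by the previous one gives an estimate of the growth factor used by a given implementation.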

For processing strings, streams in C++ can be slow
From Daniel Lemire's Blog

The C++ library has long been organized around stream classes, at least when it comes to reading and parsing strings. But streams can be surprisingly slow. For...

How many billions of transistors in your iPhone processor?
From Daniel Lemire's Blog

In about 10 years, Apple has multiplied the number of transistors in its mobile processors by 19. This corresponds roughly to a steady rate of improvement of 34%...
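
The relation between the two figures can be checked with a short calculation: a constant yearly rate r over n years gives a total factor of (1 + r)^n, so r is the n-th root of the factor, minus one. The function below is a small sketch of that arithmetic; the name `annual_growth_rate` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Constant yearly growth rate r such that (1 + r)^years == factor.
double annual_growth_rate(double factor, double years) {
    assert(factor > 0.0 && years > 0.0);
    return std::pow(factor, 1.0 / years) - 1.0;
}
```

With a factor of 19 over 10 years, the result is a little over 0.34, which matches the rate quoted in the excerpt.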

Randomness in programming (with Go code)
From Daniel Lemire's Blog

Computer software is typically deterministic on paper: if you run the same program twice with the same inputs, you should get the same outputs. In practice, the...

Parsing integers quickly with AVX-512
From Daniel Lemire's Blog

If I give a programmer a string such as "9223372036854775808" and I ask them to convert it to an integer, they might do the following in C++: std::string s = .....
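
The code in the excerpt is cut off, so the following is not the author's version. It is a minimal scalar sketch of one conventional approach, using `std::from_chars` from C++17; `parse_u64` is a hypothetical name. The example string is 2^63, which is one more than the largest signed 64-bit integer, so the sketch parses into an unsigned 64-bit value.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Parses a decimal string into a 64-bit unsigned integer.
// Returns std::nullopt on overflow, on invalid characters, or on empty input.
std::optional<std::uint64_t> parse_u64(std::string_view s) {
    std::uint64_t value = 0;
    const char* first = s.data();
    const char* last = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || ptr != last) {
        return std::nullopt;  // error reported, or trailing characters remain
    }
    return value;
}
```

Checking `ptr != last` rejects inputs such as "12a", where a valid prefix is followed by other characters.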

Transcoding Unicode strings at crazy speeds with AVX-512
From Daniel Lemire's Blog

In software, we store strings of text as arrays of bytes in memory using one of the Unicode Transformation Formats (UTF), the most popular being UTF-8 and UTF-16...

Science and Technology links (September 2 2023)
From Daniel Lemire's Blog

Physicists have published a paper with 5154 authors. The list of authors takes 24 pages out of the 33 pages. The lesson is that if someone tells you that they...

Transcoding Latin 1 strings to UTF-8 strings at 12 GB/s using AVX-512
From Daniel Lemire's Blog

Though most strings online today follow the Unicode standard (e.g., using UTF-8), the Latin 1 standard is still in widespread use inside some systems (such as browsers)...
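
For reference, the conversion itself is simple when written one byte at a time: Latin 1 bytes below 0x80 are copied unchanged, and every other byte becomes a two-byte UTF-8 sequence. The sketch below is a plain scalar version, not the AVX-512 routine the title refers to; `latin1_to_utf8` is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Scalar Latin 1 to UTF-8 conversion, one input byte at a time.
std::string latin1_to_utf8(std::string_view latin1) {
    std::string utf8;
    utf8.reserve(latin1.size() * 2);  // worst case: every byte becomes two
    for (char ch : latin1) {
        const unsigned char byte = static_cast<unsigned char>(ch);
        if (byte < 0x80) {
            utf8.push_back(static_cast<char>(byte));  // ASCII: unchanged
        } else {
            // Two-byte sequence: 110000xx 10xxxxxx
            utf8.push_back(static_cast<char>(0xC0 | (byte >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (byte & 0x3F)));
        }
    }
    return utf8;
}
```

For example, the Latin 1 byte 0xE9 (é) becomes the two bytes 0xC3 0xA9.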

Transcoding UTF-8 strings to Latin 1 strings at 12 GB/s using AVX-512
From Daniel Lemire's Blog

Most strings online are Unicode strings in the UTF-8 format. Other systems (e.g., Java, Microsoft) might prefer UTF-16. However, Latin 1 is still a common encoding...

Coding of domain names to wire format at gigabytes per second
From Daniel Lemire's Blog

When you enter the domain name lemire.me in your browser, it eventually gets encoded into a so-called wire format. The name lemire.me contains two labels, one of...
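
In the DNS wire format, each label is preceded by a byte giving its length, and the name ends with a zero byte. The function below is a simple scalar sketch of that encoding, not the fast routine discussed in the post; `to_wire` is a hypothetical name, and the sketch checks the 63-byte label limit but not the limit on the total name length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Encodes a dotted domain name as length-prefixed labels ending in a 0 byte.
// Returns std::nullopt for empty labels or labels longer than 63 bytes.
std::optional<std::vector<std::uint8_t>> to_wire(std::string_view name) {
    std::vector<std::uint8_t> out;
    if (!name.empty() && name.back() == '.') {
        name.remove_suffix(1);  // accept a single trailing dot
    }
    std::size_t start = 0;
    while (start < name.size()) {
        const std::size_t dot = name.find('.', start);
        const std::size_t end = (dot == std::string_view::npos) ? name.size() : dot;
        const std::size_t len = end - start;
        if (len == 0 || len > 63) {
            return std::nullopt;
        }
        out.push_back(static_cast<std::uint8_t>(len));
        out.insert(out.end(), name.begin() + start, name.begin() + end);
        start = end + 1;
    }
    out.push_back(0);  // terminating root label
    return out;
}
```

For lemire.me, this produces the byte 6, the six letters of "lemire", the byte 2, the two letters of "me", and a final 0, for 11 bytes in total.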

Science and Technology links (August 6 2023)
From Daniel Lemire's Blog

In an extensive study, You et al. (2022) found that meat consumption was correlated with higher life expectancies: Meat intake is positively correlated with life...

Decoding base16 sequences quickly
From Daniel Lemire's Blog

We sometimes represent binary data using the hexadecimal notation. We use a base-16 representation where the first 10 digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and where...
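
A straightforward decoder makes the notation concrete: two hexadecimal digits form one byte. The sketch below is a plain scalar baseline rather than the fast approach the title suggests. The names `hex_value` and `decode_base16` are hypothetical, and since the excerpt is cut off, the sketch assumes the usual convention that the remaining six digits are the letters A to F in either case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Value of one hexadecimal digit, or -1 if the character is not a digit.
constexpr int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes pairs of hexadecimal digits into bytes.
// Returns std::nullopt for odd-length input or non-hexadecimal characters.
std::optional<std::vector<std::uint8_t>> decode_base16(std::string_view hex) {
    if (hex.size() % 2 != 0) return std::nullopt;
    std::vector<std::uint8_t> bytes;
    bytes.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int hi = hex_value(hex[i]);
        const int lo = hex_value(hex[i + 1]);
        if (hi < 0 || lo < 0) return std::nullopt;
        bytes.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
    }
    return bytes;
}
```

For instance, the input "48656c6C6f" decodes to the five bytes 0x48, 0x65, 0x6c, 0x6c, 0x6f.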

Science and Technology links (July 23 2023)
From Daniel Lemire's Blog

People increasingly consume ultra processed foods. They include energy drinks, mass-produced packaged breads, margarines, cereal, energy bars, fruit yogurts, fruit...

Fast decoding of base32 strings
From Daniel Lemire's Blog

We often need to encode binary data into ASCII strings. The standards (e.g., email) to do so include base16, base32 and base64. There are some research papers on...

Science and Technology links (July 16 2023)
From Daniel Lemire's Blog

Most people think that they are more intelligent than average. Lack of vitamin C may damage the arteries. Make sure you have enough! A difficult problem in software...

Recognizing string prefixes with SIMD instructions
From Daniel Lemire's Blog

Suppose that I give you a long list of string tokens (e.g., “A”, “A6”, “AAAA”, “AFSDB”, “APL”, “CAA”, “CDS”, “CDNSKEY”, “CERT”, “CH”, “CNAME”, “CS”, “CSYNC”, “DHC...

Stealth, not secrecy
From Daniel Lemire's Blog

The strategy for winning is simple: do good work and tell the world about it. In that order! This implies some level of stealth as you are doing the good work...

Packing a string of digits into an integer quickly
From Daniel Lemire's Blog

Suppose that I give you a short string of digits, containing possibly spaces or other characters (e.g., "20141103 012910"). We would like to pack the digits into...

Having fun with string literal suffixes in C++
From Daniel Lemire's Blog

The C++11 standard introduced user-defined string suffixes. It also added regular expressions to the C++ language as a standard feature. I wanted to have fun and...