The Rise of Big Data

I was once helping a MongoDB user with sharding. His chunks weren’t splitting and I was trying to diagnose the issue. His shard key looked reasonable, he didn’t have any errors in his log, and manually splitting the chunks worked. Finally, I looked at how much data he was storing: only a few MB per chunk. “Oh, I see the problem,” I told him. “It looks like your chunks are too small to split. You just need more data.”

“No, my data is huge, enormous,” he said.

“Um, okay. If you keep inserting data, it should split.”

“This is a bug. My data is big.”

We argued back and forth a bit, but I managed to back off from having called his data small and convince him it wasn’t a bug. That day I learned that people take their data size very personally.

Kristina Chodorow's blog