I’d like to convince you that it is rarely in your best interest to delete text for space reasons. The logic goes like this:
- Assume a very good secretary can type 150 words per minute, or 9,000 words an hour.
- Let’s assume that the average word length is ten letters, which is high.
- This secretary then types 9,000 words an hour, or 90,000 characters per hour. Let’s round that up very generously, by more than a factor of ten, to one million characters per hour.
- Now let’s say that you want to store not only the character, which takes about a byte, but also plenty of metadata. For each character, store:
  - The character (1 byte)
  - The timestamp (8 bytes)
  - The full file path (max 260 bytes in Windows)
  - The user name (100 bytes max?)
  - The place in the file (max 8 bytes)
  - Some other stuff
- Note that if you’re storing it this way, you’re recording it as a journal and can store every single micro-change made to the file.
- The sized fields above add up to fewer than 400 bytes, so let’s be generous and say you somehow want to store a thousand bytes of data for each character.
- One thousand bytes per character times one million characters per hour comes to one gigabyte of data per hour. That may sound like a lot, but consider this: modern hard drives cost as little as five cents per gigabyte. A 3 TB hard drive can be found for about $170. That’s roughly five cents an hour to record every micro-change made.
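
As a rough check of the arithmetic above, here is a small Python sketch. The constant and function names are my own and purely illustrative; the figures (one million characters per hour, a thousand bytes per character, five cents per gigabyte) come from the list, and I am assuming a gigabyte means 10^9 bytes, as drive vendors count it.

```python
# Back-of-envelope check of the storage cost estimate.
# All names here are illustrative; the numbers come from the list above.

CHARS_PER_HOUR = 1_000_000     # rounded-up typing rate
BYTES_PER_CHAR = 1_000         # generous per-character journal budget
DOLLARS_PER_GB = 0.05          # five cents per gigabyte
BYTES_PER_GB = 1_000_000_000   # assumption: decimal gigabytes

# Itemized field sizes from the list, in bytes ("some other stuff" excluded).
FIELD_BYTES = {
    "character": 1,
    "timestamp": 8,
    "file_path": 260,
    "user_name": 100,
    "file_offset": 8,
}


def gigabytes_per_hour(chars_per_hour, bytes_per_char):
    """Journal volume produced by one typist in one hour, in gigabytes."""
    return chars_per_hour * bytes_per_char / BYTES_PER_GB


def cost_per_hour(chars_per_hour, bytes_per_char, dollars_per_gb):
    """Storage cost in dollars for one hour of journaled typing."""
    return gigabytes_per_hour(chars_per_hour, bytes_per_char) * dollars_per_gb


itemized_bytes = sum(FIELD_BYTES.values())
hourly_gb = gigabytes_per_hour(CHARS_PER_HOUR, BYTES_PER_CHAR)
hourly_cost = cost_per_hour(CHARS_PER_HOUR, BYTES_PER_CHAR, DOLLARS_PER_GB)

print(f"Itemized fields: {itemized_bytes} bytes per character")
print(f"Journal volume:  {hourly_gb:.1f} GB per hour")
print(f"Storage cost:    ${hourly_cost:.2f} per hour")
```

The itemized fields come to well under the thousand-byte budget, so the one-gigabyte-per-hour figure already has plenty of slack built in.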
I think that businesses should strongly consider this option. I should also note that this doesn’t apply to other kinds of files, like videos or pictures or audio, nor does it apply to storing machine-generated data, like system logs.
I should also say that there must be an easy way to replace these drives as they fill up. At one gigabyte per hour, a rather large two-terabyte drive would last 2,000 hours, which is about a year of standard 40-hour office weeks. I suggest putting these drives in a hot-swappable machine on the network, and putting the full drives in storage.
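
The drive-lifetime estimate can be sketched the same way. The function names are hypothetical, and the 40-hour work week and decimal terabytes (1 TB = 1,000 GB) are my assumptions.

```python
# Rough estimate of how long one drive lasts at a given journaling rate.
# Names are illustrative; 1 TB is taken as 1,000 GB (an assumption).

def drive_lifetime_hours(capacity_tb, gb_per_hour=1.0):
    """Hours of journaling a drive can absorb before it is full."""
    return capacity_tb * 1_000 / gb_per_hour


def hours_to_work_weeks(hours, hours_per_week=40):
    """Convert hours of typing into office weeks (assumed 40 hours each)."""
    return hours / hours_per_week


lifetime_hours = drive_lifetime_hours(2)           # the two-terabyte drive
lifetime_weeks = hours_to_work_weeks(lifetime_hours)

print(f"{lifetime_hours:.0f} hours, or about {lifetime_weeks:.0f} work weeks")
```

This covers a single typist; several people journaling to the same drive would fill it proportionally faster, which you can model by raising `gb_per_hour`.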