Recently, I had data in a users.json file that was taking a long time to load in VS Code because the file was too large (which surprised me, since it was only a 29 MB file). I wanted to use this as a chance to play around with Python's memory usage, so I loaded the whole file into memory and it worked as expected.
I do have a question though, or rather I need an explanation; forgive me if the answer is too obvious.
When I introspected the loaded JSON object, I found that the object size (1.3 MB) was far smaller than the file size (29.6 MB) on my file system (macOS). How could this be? The difference in size is just too large to ignore. To make things more confusing, I had a smaller file, and that one returned nearly identical on-disk and loaded sizes (~358 KB), haha.
import json
with open('users.json') as infile:
    data = json.load(infile)

print(f'Object Item Count: {len(data):,} items\nObject Size: {data.__sizeof__():,} bytes')
Using sys.getsizeof(data) returns something similar, just with a small amount of GC overhead added.
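For reference, this is roughly what I mean (a small sketch, assuming the same users.json as above):

import json
import sys

with open('users.json') as infile:
    data = json.load(infile)

# __sizeof__() reports the object's own footprint; sys.getsizeof() adds the
# garbage collector's header for container objects, so the two numbers
# differ by only a few bytes.
print(f'__sizeof__:    {data.__sizeof__():,} bytes')
print(f'sys.getsizeof: {sys.getsizeof(data):,} bytes')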
The following, on the other hand, returns the actual size of the file on disk (29,586,765 bytes, ~29.6 MB):
from pathlib import Path
Path('users.json').stat().st_size
Can someone please explain what is happening? One would think the two sizes should be similar, or maybe I'm wrong.