• didgetmaster 16 hours ago

    I have long been of the opinion that the basic architecture of file systems is antiquated and needs to be replaced. Any technique you devise for organizing your files in a traditional hierarchical folder structure will quickly break down once the number of files gets really large and they are spread across many storage devices.

    We need to move to an 'object store' architecture that can effectively manage hundreds of millions of files; attach a set of meaningful meta-data tags to each one; and find anything and everything in just a second or two of searching.

    I created such a system; it works exceptionally well and is currently in open beta.

    • curious_soul 11 hours ago

      I assume it's the one linked on your profile. I'll check it out!

      • didgetmaster 6 hours ago

        Great. Didgets is a general-purpose data management system that supports multiple data models. The object store features allow it to manage file data. The tagging system I designed to attach meta-data tags to each object also allows it to manage structured data like relational databases.

        The website was designed to highlight its DB features and how it can be used to quickly analyze data in its tables. Don't let that distract you from its initial purpose as a file system replacement.

    • MountainMan1312 2 days ago

      Personally I use a single folder named "kb" (short for knowledgebase). This folder contains every single file I've considered worthy of saving. I keep backups and sync between all devices.

      The secret is in the naming conventions. I've written about mine extensively here on HN [1], so I won't go into too much detail here, but for your receipt example I'd do `fin.receipt.[date].eye-doctor-visit.png`.
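
      A minimal sketch of generating names in this dotted style. The function name `kb_name`, the field order, and the ISO date format are assumptions for illustration; the comment only shows `[date]` without specifying a format.

```python
from datetime import date


def kb_name(category: str, kind: str, when: date, description: str, ext: str) -> str:
    """Build a flat, dot-separated filename such as
    fin.receipt.2024-09-25.eye-doctor-visit.png."""
    # Lowercase the description and join its words with hyphens so that
    # dots stay reserved as the separator between fields.
    slug = "-".join(description.lower().split())
    # Assumption: ISO 8601 dates (YYYY-MM-DD), which sort chronologically by name.
    return f"{category}.{kind}.{when.isoformat()}.{slug}.{ext.lstrip('.')}"
```

      Because every name starts with the broad category, a plain alphabetical listing of the single folder already groups related files together.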

      Family members' files should be in those people's home folders. Mixing people on one account is just asking for trouble in my opinion. If there's an enumerable number of important documents, consider a central family folder which gets backed up frequently, with a subfolder for each person and one for shared family stuff.

      - [1]: https://news.ycombinator.com/item?id=41370673#41373817

      • curious_soul 2 days ago

        Thanks! This is the kind of thing I was interested in - some inspiration for naming/tagging or some novel approaches to organizing data. I came across this blog post which I thought was interesting: https://karl-voit.at/2022/01/29/How-to-Use-Tags/

        Regarding family members, these are just some important docs and such of family members that are not tech savvy/don't use my computer.

      • tacostakohashi 2 days ago

        I use a simple two-level hierarchy. If you add one extra level to what you have, I think you'll find it works pretty well.

        medical/eye-doctor

        medical/dermatologist

        bankA/account1234

        bankA/account5678

        bankB/account9999

        https://johnnydecimal.com/ is a slightly more elaborate system. The basic idea is the same though: just have a fixed number of levels. It turns out 2 levels works very well. You don't need to overthink the categories at each level too much, e.g. maybe bankA + bankB could be "finance" instead.

        finance/bankA-account1234

        finance/bankA-account5678

        finance/bankB-account9999

        If the number of second-level directories gets out of hand, split it up into multiple, more-specific top level directories. If you just have one or two second-level directories, combine them into a more generic top level directory.

        Let's say you end up with 20 ish top level directories, and they each have 20 ish second level directories... now you can have 400 ish directories (which is plenty), but you only have one "screenful" at each level to ls, navigate, etc.

        Within each folder, the files are named YYYY-MM-DD-some-name.pdf. The second level folders are generally some individual/organization/account, and all the things pertaining to that relationship go in there - so, the eye doctor receipt goes in:

        medical/joe-smith-md/2024-09-25-receipt.pdf

        The joe-smith-md subfolder has all the things from that doctor... receipts, prescriptions, charts, whatever, with dated filenames.
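
        The scheme above can be sketched roughly as follows. The helper names `filing_path` and `crowded_top_levels` are hypothetical, and the limit of 20 subfolders simply mirrors the "20 ish" figure mentioned above.

```python
from datetime import date
from pathlib import Path, PurePosixPath


def filing_path(top: str, second: str, when: date, name: str, ext: str = "pdf") -> PurePosixPath:
    """Build top/second/YYYY-MM-DD-name.ext for the two-level hierarchy."""
    return PurePosixPath(top) / second / f"{when.isoformat()}-{name}.{ext}"


def crowded_top_levels(root: Path, limit: int = 20) -> list[str]:
    """Return the top-level folders that hold more than `limit` subfolders,
    i.e. candidates for splitting into more specific top-level folders."""
    crowded = []
    for top in sorted(p for p in root.iterdir() if p.is_dir()):
        count = sum(1 for child in top.iterdir() if child.is_dir())
        if count > limit:
            crowded.append(top.name)
    return crowded
```

        For example, `filing_path("medical", "joe-smith-md", date(2024, 9, 25), "receipt")` gives `medical/joe-smith-md/2024-09-25-receipt.pdf`.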

        • skydhash 2 days ago

          Mine is mostly purpose driven. There are the fundamentals folders. Each has its own naming and categorization strategy. Then there are the folders inside these folders to help with the above, and there are the container folders, which are mostly content types.

          The fundamentals projects are often automated or augmented with tools. If not I just think about what are the primary keys I would have when I'm looking for it (and I don't mind duplicating if the file is final, like receipts or contracts). For stuff you're currently working on, you can use versioning and timestamps to have stuff to file.

          It has become surprisingly easy to find files after that. And with tools like spotlight (macos), the result makes much more sense.

          • k310 2 days ago

            I have some very broad categories as you mentioned, but since I take and screensnap countless pictures, I could really use: 1. OCR that goes into and tags every file, and 2. Image categorization.

            Various services do one or the other (I dropped Evernote and don't use Google) but it would be nice on my private system. I use Apple picture tools mostly for amusement. They are extremely limited at categorizing.

            I have been around computers long enough to know that any software which isn't open source will go away long before my data goes away, or will be M&A'ed into something more expensive with a subscription. So I don't venture far from basic tools, even though there may be some wonderful $1595.00 app out there.

            I even lost the script that grabs the first line of a PDF with an inscrutable arxiv file name and renames the file.

            I'm afraid that as an old-timer, I might become like young people who have no idea of a file system, and rely on search. Search could be a lot better with my two requests above. And FOSS, please, so tools don't vanish or get priced into the stratosphere.

            • eternityforest 2 days ago

              I have a top level "Projects" folder, one subfolder per project.

              I have a "Clients" folder: one folder per other person or company I'm doing anything for, with a subfolder per project.

              I have "Archive" which syncs to all my devices, and from my phone it syncs to Google Docs. From my laptop it is backed up to a Synology NAS with the rest of my disk. I use it for things I will actually want to see in ten years.

              I have music/videos/books folders, and I also have a "Collected" folder for anything downloaded that I think I want to keep (sorted by category in a low-effort manner).

              I don't make enough for deductions to be a thing, so I don't really have any accounting to do that isn't already done by some cloud platform somewhere.

              I don't see why multiple people's files would be in one account, that sounds like a hassle.

              • noud 2 days ago

                KISS. In my home folder (*nix based):

                  bin/
                    ...all personal bash scripts can be found here...
                  documents/
                    audiobooks/
                    backups/
                    books/
                    personal/
                    projects/
                    videos/
                    work/
                  downloads/
                  hosts/
                    ...sshfs to all important folders on several servers, no sync...
                  images/
                  music/
                  share/
                    ...for locally installed software, no sync...
                  temp/
                    ...remove all files in this folder once a month, no sync...
                
                I (r)sync these folders with all my computers once every day. I have used this structure for the last 15 years. Therefore, I know by heart where to find what. Perhaps I could move the downloads folder into my temp folder. I don't know why I don't.
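
                The monthly cleanup of temp/ could be scripted along these lines. This is only a sketch: the function name `empty_folder` and the dry-run default are assumptions, and scheduling it once a month (cron or similar) is left out.

```python
import shutil
from pathlib import Path


def empty_folder(folder: Path, dry_run: bool = True) -> list[Path]:
    """Delete everything inside `folder` but keep the folder itself.

    With dry_run=True (the default) nothing is deleted; the function only
    returns what would be removed, so the list can be reviewed first.
    """
    entries = sorted(folder.iterdir())
    if not dry_run:
        for entry in entries:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)  # real directory: remove recursively
            else:
                entry.unlink()        # file or symlink: remove just the entry
    return entries
```

                Calling `empty_folder(Path.home() / "temp")` lists the candidates; passing `dry_run=False` actually deletes them.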
                • fasa99 4 hours ago

                  I take an approach a lot like this - brief straightforward lowercase names e.g.

                  books/

                  periodicals/

                  films/

                  television/

                  That's the root directory.

                  Now for subfolders I keep a numerical scheme, e.g. films/000_action, films/000_docudrama, books/000_science/000_biology

                  So the 000_ prefix indicates a major class - fits well with a "grep" or to sort by name in a file browser gui both.

                  Then from that I can add suffixes such as 001_90s_cartoons_megaset to signify this is a massive multi-terabyte wallop of interest

                  The upside of this approach is structure. The downside is that if one is building a good characterization of these things you soon realize a tag-based system might be better - the filesystem forces a tree structure. If these were managed by a database/indexing system, then presumably the tags would live there (not the filesystem) and the filesystem could merely be some sort of balanced tree that maximizes performance, filenames may as well be random 64 character IDs. This approach then requires some effort to organize into the tree (much effort over decades) for some structure to "bleed through" to the filesystem. But at this scale to really manage it, it's scripts and automation and indexes all the way.

                  The other hard part of it is maintenance and adding new data since data tends to grow exponentially. Still manual, could be automated if I was hardcore about it. One can also build search indexes but I personally tend to do every few months a simple "find | gzip -9 > listing.txt.gz" and then grep that for fast search in the future. There are faster ways to do this algorithmically surely, but this is perhaps the fastest way as far as minimizing the amount of time & effort I put in with a good payoff ratio for "just working"
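
                  For anyone without find/grep at hand, the same "compressed listing, then search it" idea can be sketched in Python. The function names are hypothetical; unlike `find`, this version does not list the root directory itself.

```python
import gzip
import os
from pathlib import Path


def write_listing(root: Path, listing: Path) -> int:
    """Walk `root` and store one path per line in a gzip-compressed text file.
    Returns the number of paths written."""
    count = 0
    # surrogateescape keeps odd, non-UTF-8 filenames from aborting the walk.
    with gzip.open(listing, "wt", compresslevel=9,
                   encoding="utf-8", errors="surrogateescape") as out:
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                out.write(os.path.join(dirpath, name) + "\n")
                count += 1
    return count


def search_listing(listing: Path, needle: str) -> list[str]:
    """Return every listed path that contains `needle`, ignoring case."""
    needle = needle.lower()
    with gzip.open(listing, "rt", encoding="utf-8", errors="surrogateescape") as src:
        return [line.rstrip("\n") for line in src if needle in line.lower()]
```

                  The listing goes stale between runs, which matches the "every few months" rhythm described above: it trades freshness for zero ongoing maintenance.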

                • undopamine a day ago

                  I just create shortcuts for resolving any potential ambiguities.

                  My backups in other devices are organized by monthly folders.

                  • ofalkaed a day ago

                    When I found myself in your situation I went to manual backups and regular cleaning of my computers/backups. If it is important I back it up immediately; if it is something I will not need regular access to, it gets taken off the computer. Mostly I started treating my digital life as I treat my physical life: I do not put an addition on my home or rent a storage space whenever I run out of space, I get rid of stuff.

                    On my primary backup drive I have two extra folders, archive and holding; archive has all that stuff like taxes that I rarely if ever need to access. Holding is for all that stuff I am not sure if I really need anymore but can't quite bring myself to delete, anything that spends enough time in holding to get forgotten about just gets deleted.

                    My secondary backup drive only holds things which are actually important and the loss of would cause more than a minor inconvenience or some sadness/regret over their loss.

                    My computer has essentially the standard generic file structure of Documents, Downloads, Video, Music, Desktop, and Pictures with the addition of folders for current projects and a tmp folder and none get subdivided with nested directories (but the primary backup does). When Documents and Downloads start getting difficult to navigate I clean and backup including going through my backup drives. Desktop is for stuff I want to get to somewhat soon like a pdf I want to read, tmp is for stuff I do not need on my computer on a daily basis but may need to access semi regularly in the next month or so, it gets gone through and cleaned most any time I copy something into it.

                    It took a little time to get used to, and at the start I had to invest a fair amount of time, but now it just means I spend a few minutes every day cleaning and backing up and once a week or so put in 15 to 30 minutes doing it. Not much different than any physical hobby or part of life: cleaning the kitchen after cooking, or sweeping up the sawdust and putting away the tools after woodworking.

                    If you have a receipts directory, all your receipts should go in it, but I would question the need for a dedicated receipts directory. If a receipt is important, it should probably fall under another category (like health or taxes/write-offs); if it doesn't, it is probably not something you need to keep long term and can go in a temporary folder that regularly gets cleaned out. I use my documents folder for those. Other family members should have their own directory for everything that is theirs; give them a quota and let them deal with it.

                    • mmphosis 2 days ago

                      Move "away" from home:

                        /away/
                          3rd/
                          doc/
                          pub/
                          src/
                      
                      Stop using files except one file to rule them all:

                        database.sqlite
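
                      A rough sketch of what "one file to rule them all" might look like with Python's built-in sqlite3 module. The schema (a `files` table of blobs plus a `tags` table) and all function names are assumptions for illustration, not something the comment specifies.

```python
import sqlite3
from typing import Iterable, Optional


def open_store(db_path: str) -> sqlite3.Connection:
    """Open (or create) the single database file and make sure the tables exist."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, data BLOB NOT NULL)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        " name TEXT NOT NULL, tag TEXT NOT NULL, PRIMARY KEY (name, tag))"
    )
    return conn


def put(conn: sqlite3.Connection, name: str, data: bytes, tags: Iterable[str] = ()) -> None:
    """Store a document's bytes under `name` and attach any number of tags."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)", (name, data))
        conn.executemany(
            "INSERT OR IGNORE INTO tags (name, tag) VALUES (?, ?)",
            [(name, tag) for tag in tags],
        )


def find_by_tag(conn: sqlite3.Connection, tag: str) -> list[str]:
    """Return the names of all documents carrying `tag`, sorted by name."""
    rows = conn.execute("SELECT name FROM tags WHERE tag = ? ORDER BY name", (tag,))
    return [row[0] for row in rows]


def get(conn: sqlite3.Connection, name: str) -> Optional[bytes]:
    """Return the stored bytes for `name`, or None if it is not in the store."""
    row = conn.execute("SELECT data FROM files WHERE name = ?", (name,)).fetchone()
    return row[0] if row else None
```

                      With tags instead of folders, a receipt can carry both "receipt" and "health" and be found under either.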
                      • paulcole 2 days ago

                        I will put stuff like tax returns and my apartment lease into iCloud and just delete everything else.

                        I never end up coming back to anything and thinking about organizing files is just not fun to me.

                        • skydhash 2 days ago

                          Take care with iCloud as it's just a syncing tool, not a backup one. You should probably have timestamped backups in some (encrypted) drives or other online storage.

                          • paulcole 2 days ago

                            I honestly have never looked at them again. If iCloud loses them, that's life. I just don't care enough to be bothered. Data storage isn't a hobby of mine.

                          • curious_soul 2 days ago

                            I don't like having my data in the cloud. I just use syncthing for now for backups, but setting up a proper NAS and improving my backup strategy is on my todo list.

                            • paulcole 2 days ago

                              Isn't it easier to just like having your data in the cloud and not spend your time on setting up a proper NAS and improving your backup strategy? Or is it more that you find that kind of tinkering fun and a good use of your time?

                              Cloud is good enough for millions of people. What makes you different?

                              • curious_soul 2 days ago

                                I care about data privacy.

                                • paulcole a day ago

                                  Fair enough. Have fun setting up the NAS.

                                  • deafpolygon a day ago

                                    iCloud has E2EE with ADP, if that's something that would interest you.

                            • gaws 2 days ago

                              I use the Johnny Decimal system[1], and it's worked well for my organizational needs.

                              [1]: https://johnnydecimal.com/

                              • curious_soul a day ago

                                I had come across this before but had forgotten the name. Thanks!

                              • closetkantian 2 days ago

                                I stopped organizing files, and I just use Everything search all the time (https://www.voidtools.com/)

                                • gaws 13 hours ago

                                  Everything is great. If only there was a similar GUI tool with the same level of performance for GNU/Linux.

                                  • curious_soul 2 days ago

                                    I use this tool too and it's amazing. But this is why file naming convention becomes important.

                                  • al_borland 2 days ago

                                    About 20 years ago I read a post on LifeHacker about having folders a-z (and one for numbers) at the top level and using that to avoid a never-ending list of top level folders.

                                    I set that up and still have it. It’s not my day-to-day management, but it’s my filing system. If I need to find my childhood vaccination records, I go to H/Health/vaccinations.txt
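
                                    As a small illustration of the a-z layout, here is a sketch that derives the letter folder from a topic name. The function name `az_path` and the "0-9" name for the numbers folder are assumptions; the original post's naming is not given here.

```python
from pathlib import PurePosixPath


def az_path(topic: str, filename: str) -> PurePosixPath:
    """Place `filename` under <LETTER>/<topic>/, where LETTER is the topic's
    first character. Anything not starting with A-Z goes into "0-9"."""
    topic = topic.strip()
    first = topic[:1].upper()
    bucket = first if "A" <= first <= "Z" else "0-9"
    return PurePosixPath(bucket) / topic / filename
```

                                    So `az_path("Health", "vaccinations.txt")` yields `H/Health/vaccinations.txt`. The Health vs Medical naming question remains a human decision.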

                                    It’s not perfect. I need to remember what I called something. Health vs Medical or having separate sub folders that might live somewhere else… but I just do what makes sense to my brain at the time, and usually that’s where I end up looking later.

                                    I’ve tried looking up more prescriptive systems, and they never work for me, because it works how someone else’s brain works, not mine.

                                    • beryilma a day ago

                                      I have been doing this for more than 20 years and it has been working well for me. I have a Works folder and 26 subfolders under it, each subfolder therein starting with the same letter...

                                    • BobbyTables2 2 days ago

                                      Important documents, photos, worthy personal projects, … delete all the rest!

                                      • slightwinder 20 hours ago

                                        > Currently I just have folders for high level topics like Media, Finance, Projects, etc.

                                        Why Media? Not enough files, or do you have just one media type?

                                        > I am finding my current nested folder structure sub-optimal.

                                        Add more subfolders and granularity.

                                        I organize my files by purpose, and then go deeper.

                                        My root is called "Data", and each subfolder is a general category. For example, for media I have "Pictures", "Audio", "Video". For documents I have "Documents", "EBooks", and "Knowledge Base". For Software I have "Apps", "Code", "Software", "Docker", "VirtualEnv". For configs I have "Config" and "Config NoGit" (for configs containing binary data, not suitable for git).

                                        And each of them has more subfolders. Audio has for example "Audiobooks", "Music", "Podcasts" and "Sounds" (which are short sound-clips). "Code" is just a collection of git repos with source code. "Apps", on the other hand, is full of different subfolders like "Appimages" and "bin", where I store AppImages and binaries of apps I use. But I also have "Firefox" and "Thunderbird" for a local extracted installation of those. "Software" on the other side contains archives and install-files for software, but also roms, and other 'dead' archived files. And "Docker" is full of subfolders with data-dirs for the docker-containers I use. "Documents" contains subfolders for each company I have documents from, while "EBooks" is full of subfolders with libs from Calibre for Ebooks (meaning commercialized documents, which I can find in a database).

                                        > * Finding files becomes difficult

                                        I don't really have this problem, because everything has an obvious place, but I've developed and fine-tuned this over the course of nearly 30 years. The big task is always to figure out what is obvious to you, and how it works with the constraints you have. What works today might not work tomorrow, or in 10 years. You should be willing to adapt and fine-tune your organization every few years.

                                        > Confusion when there is ambiguity: e.g. does my eye doctor's receipt belong under Receipts or Health?

                                        What is its purpose? It's a receipt, so it should be with receipts. If you don't have a folder for receipts, you could maintain a folder dedicated to your doctor. If you do not have enough files related to your doctor, I would think Finances has a more obvious purpose than Health. You should try to replicate the train of thought you will have when you search for this file. So you need to be somewhat flexible in your organization.

                                        • Nales 9 hours ago

                                          I think you should really check out Johnny.Decimal[0] as a lot of other comments mentioned. Even if you do not apply the decimal prefixes, it will help you reconsider your file organization. The author explains his concepts of areas and categories well.

                                          About what I do. In my home directory on my computer, I have one directory called "Zarchive" where I put all the stuff I want to keep. (I called it "Zarchive" because it is a French pun. I keep "mes archives" in it.) The other directories in my home directory are symbolic links/shortcuts linking to directories in "Zarchive" or they are things I do not care to lose.

                                          To organize things in my special directory, I first used Johnny.Decimal. Then, I got rid of the decimal prefixes because I am the sole user of this file hierarchy.

                                          Here is a summary of the directory organization in this directory. I did not put all the directories but you get the idea.

                                              .
                                              ├── index.txt           A file explaining what each directory is for.
                                              ├── Configuration       Conf for my PC (dotfiles, scripts...).
                                              ├── Knowledge           My notes.
                                              ├── Media               Things I keep but I could find online or in store.
                                              │   ├── Books
                                              │   ├── Games
                                              │   └── Videos
                                              ├── Personal            All things that are personal (pictures, GTD...).
                                              ├── TODO                Things I should do one day (
                                              │   ├── 2024-08-17 Articles to read
                                              │   └── 2024-08-17 Phone pictures
                                              ├── Various             Things that I do not know where to put.
                                              └── Work                Anything work related (personal or professional)
                                                  ├── Companies
                                                  ├── Employment
                                                  ├── Freelance
                                                  └── Projects
                                          
                                          The big advantage of having a good file hierarchy in one directory is that it makes archiving way way easier. For example, when I sync my data with an external hard drive, I can use a script to compare which files changed and another script to copy things between my computer and the hard drive, while automatically making a zip on my computer to avoid losing something by accident.
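
                                          The author's scripts are not shown, so the following is only a guess at what the "compare" step could look like: it treats a file as changed when its size or modification time (to the second) differs. The names `snapshot` and `diff_trees` are hypothetical.

```python
import os
from pathlib import Path


def snapshot(root: Path) -> dict[str, tuple[int, int]]:
    """Map each file's path relative to `root` to (size, mtime in whole seconds)."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            stat = full.stat()
            files[full.relative_to(root).as_posix()] = (stat.st_size, int(stat.st_mtime))
    return files


def diff_trees(source: Path, backup: Path) -> dict[str, list[str]]:
    """Report files that are new, deleted, or changed in `source` relative to `backup`."""
    src, dst = snapshot(source), snapshot(backup)
    return {
        "new": sorted(src.keys() - dst.keys()),
        "deleted": sorted(dst.keys() - src.keys()),
        "changed": sorted(p for p in src.keys() & dst.keys() if src[p] != dst[p]),
    }
```

                                          Comparing size and mtime is fast but can miss edits that preserve both; hashing file contents would be the stricter (and slower) alternative.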

                                          Maybe I could do a full blog post on my file organization and its management if you want.

                                          Anyway, to get back on your topic. If I were you, I would put all the files you mentioned in the directory "Personal". Now for your example for the eye doctor: you could create a directory "2024-09-26 Eye doctor appointment" with everything in it; or you could have directories named "Receipts" and "Health", and in Health you have a symbolic link/shortcut to the receipt in Receipts; or you could have a copy of the receipt in both directories. I prefer to make each directory independent, so I am not fond of the shortcut solution.
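
                                          For the symbolic link option, a minimal sketch on a system that supports symlinks (on Windows this may need extra privileges). The function name `link_into` is hypothetical. The link is made relative so that it keeps working if the whole parent directory is moved or synced elsewhere.

```python
import os
from pathlib import Path


def link_into(original: Path, other_dir: Path) -> Path:
    """Create a relative symlink to `original` inside `other_dir`
    and return the path of the new link."""
    other_dir.mkdir(parents=True, exist_ok=True)
    link = other_dir / original.name
    # Express the target relative to the link's own directory,
    # e.g. ../Receipts/receipt.pdf when linking from Health.
    target = os.path.relpath(original, start=other_dir)
    link.symlink_to(target)
    return link
```

                                          The drawback mentioned above still applies: the Health directory is no longer self-contained, because copying it alone leaves a dangling link.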

                                          For your family member's files, you could have a directory for each of them. Maybe you could create a directory "Family" in the top directory and each member is responsible for his or her stuff.

                                          With that said, it does not completely remove the need to search for files. Yes, after doing that you are now organized with a good file hierarchy, but that does not mean you cannot use a good tool for finding files. I do not have many suggestions in this area because I am lucky enough to be good at remembering where I put stuff. Most of the time... :)

                                          As a final word: I really encourage you first to read Johnny.Decimal, then to really take the time to define the areas and categories without rushing, then to organize the files, and finally to use the search tools. Some people say that you do not need to organize anything and that you can rely only on the search tools but I completely disagree. Tools are not available forever, and it does not solve the issue of syncing correctly your files and archiving them.

                                          I hope I was clear. I did not expect to ramble so much when I started to write this post. Good luck organizing your stuff!

                                          [0] https://johnnydecimal.com/