Visualizing All ISBNs (annas-archive.org)
Submitted by RyanShook 9 hours ago
  • skrebbel 5 hours ago

    I thought it was my color blindness that made me unable to distinguish between the red and green pixels as described (I only see red and black ones), but even with a browser extension that counters color blindness I can't distinguish more colors. Is this just me, or is the graph weird?

    • Finnucane 2 minutes ago

      I am also color blind and the graph is not good.

      • saithound 5 hours ago

        Fwiw (not color-blind) I can see red, green and black pixels. The graph doesn't look weird to the naked eye.

        Find the interactive visualiser by scrolling down, and switch it to "Files in Anna's Archive [md5]". This will highlight the location of the green pixels in grey.

        • superzamp 5 hours ago

          The graph seems to be alright, there are indeed red and (some) green pixels, looks like an issue with your extension unfortunately.

          • rendx 5 hours ago

            I see green dots and a few lines of green dots. Did you try zooming in?

            • psychoslave 5 hours ago

              No idea where the issue might lie, but I can see the difference in colors.

              • asfasdfasdfn 5 hours ago

                The graphs are very easy to read, although that depends on your ability to distinguish between red and green.

                Can you change the green channel to blue to better view it?

              • WillAdams an hour ago

                The thing is, ISBNs aren't hierarchical --- they are bought in blocks (or even individually at an exorbitant markup, says the guy who bought one to reprint a single book), so this doesn't show anything really interesting/useful.

                A visualization using LoC or even Dewey Decimal would be far more useful, esp. if it also linked to public domain and copyright-free repositories/lists, say an interactive and visual version of John Mark Ockerbloom's:


                • MarceColl 42 minutes ago

                  It shows what they want to show, which is mostly how many of the world's books they have. Hierarchy has nothing to do with it.

                • billpg an hour ago

                  Anyone else seeing this?

                  "This server couldn't prove that it's annas-archive.org; its security certificate is from *.hs.llnwd.net. This may be caused by a misconfiguration or an attacker intercepting your connection."

                  • c0balt an hour ago

                    No, sounds like you are being MITM'd. Though the domain appears to be a legitimate CDN.

                  • greenie_beans 44 minutes ago

                    is it illegal to download and use their isbn file? like what is wrong with having that information?

                    • karel-3d 18 minutes ago

                      I don't think this page, which links to libgen and sci-hub, is that concerned about copyright.

                  • quink 4 hours ago

                    Kind of hard to tell what corresponds to what in these graphs, maybe if someone could point out Bookland (i.e. 978), it would be a bit easier to orient oneself?

                    • seszett 3 hours ago

                      Making it easier to visualise is the whole point of the bounty announced by this post.

                    • whataguy 5 hours ago

                      > Each pixel represents 2,500 ISBNs. If we have a file for an ISBN, we make that pixel more green.

                      What do you mean by "more green"? I don't see any shaded green.

                      And I presume the black pixels are unregistered ISBNs?

                      • lmm 5 hours ago

                        If you look closely there are definitely some brownish pixels and some dim greens.
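A minimal sketch of how the quoted "2,500 ISBNs per pixel" rule could work. The post itself only states the pixel size and that held files make a pixel "more green"; the linear index starting at prefix 978, the image width, and the linear green scale below are assumptions for illustration, not the site's actual implementation.

```python
ISBNS_PER_PIXEL = 2_500   # from the quoted post
FIRST_PREFIX = 978        # assumption: index space starts at Bookland (978), then 979
IMAGE_WIDTH = 1_000       # hypothetical image width in pixels


def isbn13_to_index(isbn: str) -> int:
    """Map an ISBN-13 to a linear index, ignoring hyphens and the check digit."""
    digits = "".join(c for c in isbn if c.isdigit())
    if len(digits) != 13:
        raise ValueError("expected 13 digits")
    body = int(digits[:12])                  # drop the trailing check digit
    index = body - FIRST_PREFIX * 10**9      # 978000000000 maps to 0
    if not 0 <= index < 2 * 10**9:           # only prefixes 978 and 979
        raise ValueError("prefix must be 978 or 979")
    return index


def pixel_for(isbn: str) -> tuple[int, int]:
    """Return the (row, column) of the pixel that covers this ISBN."""
    pixel = isbn13_to_index(isbn) // ISBNS_PER_PIXEL
    return divmod(pixel, IMAGE_WIDTH)


def green_level(files_held: int) -> int:
    """Assumed shading: green channel scales linearly with files held in the pixel."""
    return round(255 * files_held / ISBNS_PER_PIXEL)
```

Under these assumptions a pixel with only a handful of files would get a green value close to 0, which may be why several readers here see mostly red and black.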

                      • eporomaa 4 hours ago

                        Hm, I got:


                        European sanctions

                        The Council of Europe has decided that the websites of RT (formerly Russia Today) and Sputnik News may no longer be transmitted. The website you are trying to visit falls under this European sanction.


                        • reddalo 4 hours ago

                          I think the website is censored at DNS level but they chose the wrong error page.

                          In Italy it just errors out with a NS_ERROR_CONNECTION_REFUSED.

                          • flir 3 hours ago

                            You've just cleared up a minor mystery I never bothered to investigate (BT, UK). Thanks.

                            Flipping DNS to fixed it for now but I really need to move this connection to A&A.

                          • powerhugs 4 hours ago

                            Switch DNS to like (Cloudflare) or (Google)

                            • TonyTrapp 4 hours ago

                              Works fine here from a European IP.

                              • jaapz 4 hours ago

                                It's blocked at least in the Netherlands. Weirdly it mentions it being part of the sanctions against Russia, while from a cursory search I only found a judge ordering the site to be blocked because of copyright issues (thanks Brein). They probably just show the wrong error page?

                                • Cthulhu_ 4 hours ago

                                  Must be ISP specific, I'm also in NL and can access it fine.

                                  • rollulus 2 hours ago

                                    I'm also in NL. Ziggo's DNS server blocks it:

                                      $ dig annas-archive.org @
                                      annas-archive.org. 360 IN CNAME unavailable.for.legal.reasons.
                                      unavailable.for.legal.reasons. 339 IN A
                           serves a generic page mentioning Russia Today and the Pirate Bay. Not sure which one applies here.
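A rough sketch of how the check above could be automated: it parses answer lines in the format `dig` prints and flags a CNAME pointing at `unavailable.for.legal.reasons.`. The function names are hypothetical, and the block target is simply the name seen in the Ziggo output above; other resolvers may block differently.

```python
BLOCK_TARGET = "unavailable.for.legal.reasons."  # name seen in the output above


def parse_answer_line(line: str):
    """Parse one dig answer line: NAME TTL CLASS TYPE DATA. Return None if malformed."""
    parts = line.split()
    if len(parts) < 5 or not parts[1].isdigit():
        return None
    return {
        "name": parts[0],
        "ttl": int(parts[1]),
        "class": parts[2],
        "type": parts[3],
        "data": " ".join(parts[4:]),
    }


def is_blocked(answer_lines, target: str = BLOCK_TARGET) -> bool:
    """True if any answer is a CNAME to the (assumed) block-page hostname."""
    for line in answer_lines:
        record = parse_answer_line(line)
        if record and record["type"] == "CNAME" and record["data"].lower() == target:
            return True
    return False


if __name__ == "__main__":
    sample = ["annas-archive.org. 360 IN CNAME unavailable.for.legal.reasons."]
    print(is_blocked(sample))
```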
                                    • rchard2scout 4 hours ago

                                      It's blocked by my corporate networking filter for me, in the category "Illegal downloads". So the Russian sanctions message is probably incorrect indeed.

                                • Over2Chars 4 hours ago

                                  "backing up all of humanities knowledge"? 41% of all books are not worth backing up.

                                  • michaelt 3 hours ago

                                    Some people in the archiving / 'data hoarding' community feel it's simpler to just back up everything. This attitude is particularly prevalent in the communities that deal with data other people have already digitised.

                                    If you're paying $100 per book for someone to visit a major library, get the book out, scan it, check the OCR? Then you'd probably be selective, to get the most out of a limited budget.

                                    But if you're grabbing epubs and pdfs, and a book only needs $0.002 of space on a hard drive somewhere? Grabbing the useless 41% is probably cheaper and easier than exercising editorial control.
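A back-of-the-envelope sketch of that comparison, using only the figures from the comment ($100 per scanned book, $0.002 of storage per book, 41% "useless"). The collection size of one million books is a hypothetical chosen purely for illustration.

```python
from decimal import Decimal

SCAN_COST = Decimal("100")       # dollars per book scanned by hand (from the comment)
STORAGE_COST = Decimal("0.002")  # dollars of disk space per book (from the comment)
UNWANTED_SHARE = Decimal("0.41") # the "useless 41%"


def storage_cost_of_unwanted(total_books: int) -> Decimal:
    """Dollars spent storing the share of books someone considers not worth keeping."""
    return total_books * UNWANTED_SHARE * STORAGE_COST


def scan_to_storage_ratio() -> Decimal:
    """How many stored books cost the same as scanning a single one."""
    return SCAN_COST / STORAGE_COST


if __name__ == "__main__":
    # Hypothetical collection of one million books.
    print(storage_cost_of_unwanted(1_000_000))
    print(scan_to_storage_ratio())
```

With these numbers, keeping the unwanted 41% of a million-book collection costs about as much as scanning eight or nine books by hand, which is the point the comment is making.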

                                    • Over2Chars 3 hours ago

                                      I guess we could train AI on all the human-written cookbooks, romance novels, and children's books, plus the fast-expanding pile of AI-generated cookbooks, romance novels, and children's books.

                                      So yeah, let's archive them all!

                                      We'll need something to pass the time on our SpaceX rockets to Mars, where we can pass the time in our martian cave homes reading cookbooks, romance novels, and children's books.

                                    • jillesvangurp 4 hours ago

                                      The problem with such judgments is that they are subjective and subject to biases that change over time. Almost every scrap of information from ancient civilizations is considered priceless at this point because so little of it is left. Anything from obscene graffiti, shopping lists, personal messages, etc. All of it.

                                      Many autocratic regimes editorialize and censor all forms of publication. But even in the US, which is nominally still a democracy, you now have states like Florida forcing changes to literary works and banning books entirely for religious and ideological reasons. And this is not just a right-wing thing. There have been a few publishers that took it upon themselves to editorialize literature from the 19th and 20th centuries to get rid of some things that are now considered sexist, racist or otherwise offensive. The whole cancel culture is not just about canceling people, but about limiting access to their work as well.

                                      I was at a Christmas market in Berlin a few weeks ago near the Opera. There's a nice little monument there for the book burning that happened in the 1930s. Anything that was vaguely intellectual or Jewish in origin was burned right there during the Kristallnacht. Nice place for a Christmas market and a grim reminder that those calling for things to be deleted/cancelled aren't necessarily very nice people. And of course Hitler himself got cancelled. Possession or distribution of his books is still not allowed in Germany.

                                      Anyway, imagine somebody in 5000 years finding their way to some archive of Hacker News or some Reddit thread: they might look differently at the value of some of the comments than the average moderator does.

                                      • heinrich5991 2 hours ago

                                        > Possession or distribution of his books is still not allowed in Germany.

                                        AFAIK this has never been true in Germany (for the book Mein Kampf at least). AFAIK the German state of Bavaria inherited Hitler's copyright on the book, and did not republish it. This means that no one was allowed to print it for copyright reasons, but you could still own or trade existing copies of the book. After 2015, 70 years after Hitler's death, the book entered the public domain. Looking into Wikipedia, uncommented reprints have been forbidden: https://en.wikipedia.org/w/index.php?title=Mein_Kampf&oldid=..., which I didn't know before.

                                        • jillesvangurp 9 minutes ago

                                          It seems you are correct and I was only half right. Let's just say that quoting the man in public is still likely to get you in trouble. More than a few AfD politicians are finding that out the hard way.

                                        • Over2Chars 3 hours ago

                                          All action is "subjective and subject to biases that change over time". This would then imply I could never take any action, because it's just subjective and biased. Maybe that's an exaggeration of your position, but you do seem to be suggesting inaction or the impossibility of judgement. I reject this position 100%.

                                          I would suggest that judgement is a critical part of our civilization, and it's judgement that says those bits of obscene graffiti in Pompeii are priceless.

                                          Or else they could say "well, we can't claim ancient cave art is priceless, because we're biased and our biases will change over time. Maybe in a thousand years we'll discover that ancient cave art is worthless, so we'll do nothing".

                                          In fact you have judged my opinions and shared your judgement with me. Good job!

                                          Your characterization of regimes as autocratic is judgmental, biased and will change over time. But right now that's your judgement and I applaud it, even if I disagree.

                                          Gosh, book burning. Not backing up a romance novel or cookbook is definitely analogous to book burning, but I'll play along.

                                          It was a symbolic act to show a rejection of ideas, not an attempt to eradicate the books, much in the same way Gandhi encouraged the burning of foreign made clothing and products. He wasn't going to rid the world of British cloth nor were the Germans going to rid the world of non-German ideas.

                                          So yeah, when all the badly written cook books, romance novels, and children's books are in a huge bonfire, you can blame me, personally.

                                          • globnomulous 2 hours ago

                                            > All action is "subjective and subject to biases that change over time".

                                            This is poppycock. Backing up all books -- the very action discussed by the person you're answering -- is by definition neither subjective nor subject to biases.

                                            > This would then imply I could never take any action, because it's just subjective and biased.

                                            And even if the first quoted claim were true, this, too, clearly isn't. Nowhere does the comment you're answering imply that the bias or subjective rationale of an action should, ipso facto, discourage a person from taking it.

                                            Your comment is replete with similar reasoning, so warped that it's difficult to characterize as anything other than in bad faith. Indeed, this is the snottiest, rudest, least constructive comment I've seen on HN in quite some time -- excepting a couple of my snotty remarks on language or the quality of someone's writing.

                                            I have no idea what response you expect, but the only one you deserve, I think, is one that just points out your dismissiveness, sarcasm, and breathtaking contempt. What an awful way to move through the world, let alone through HN.

                                        • simpaticoder 2 hours ago

                                          Sturgeon's Law (https://en.wikipedia.org/wiki/Sturgeon%27s_law) states "90% of everything is crap" so you're not too far off.

                                          • flir 3 hours ago

                                            Everyone's 41% is different. Long tail, innit.

                                            • Over2Chars 3 hours ago


                                              • flir 2 hours ago

                                                You mention the example of romance novels above.

                                                There's a schlocky Victorian pulp novel that's of no use to anyone - except that it happens to contain a fantastically detailed description of an abandoned saltings in my hometown that nobody ever thought to record in any way. For me, those two paragraphs are gold.

                                                If the novel hadn't been digitised as part of Google's Books Archive Project, I wouldn't have been able to find those two paragraphs. Digitisation not only creates backups, it enables completely new ways of interacting with those texts (eg Google's Ngram Viewer).

                                                • Over2Chars 2 hours ago

                                                  Well I guess your one valuable paragraph that matters only to you justifies backing up millions (billions?) of human and soon to be AI generated books, because someone, somewhere, at some time will find a line or two valuable. Maybe.

                                                  I retract my position, let's back up everything!

                                                  • tiagod 25 minutes ago

                                                    I think that's the case. IIRC The British Library has copies of all published material in the UK, including flyers and such.

                                                    What seems banal and useless to you, might be extremely important for future historians, and to be honest, books are pretty compressible and storage is cheap.

                                          • sebstefan 4 hours ago

                                            >$10,000 bounty

                                            >There is much to explore here, so we’re announcing a bounty for improving the visualization above. Unlike most of our bounties, this one is time-bound. You have to submit your open source code by 2025-01-31 (23:59 UTC).

                                            >The best submission will get $6,000, second place is $3,000, and third place is $1,000.

                                            >All bounties will be awarded using Monero (XMR).

                                            Why are they using crypto, and, weirdly enough, specifically the crypto people use for buying drugs, to award this?

                                            Is it some kind of scam?

                                            • yawndex 4 hours ago

                                              Because the efforts of Anna's Archive are unfortunately currently very much illegal, and XMR is one of the few cryptocurrencies that can actually offer some privacy to its users.

                                              • sebstefan 3 hours ago

                                                I've used XMR before. Just surprised to see it used to pay for legitimate & harmless visualization work.

                                                I see, that makes sense

                                              • Klaus23 4 hours ago

                                                Because it is a book download site, which is illegal in every country that has copyright, and revealing one's identity with a bank transfer would be a stupid way to go to jail.

                                                • fear-anger-hate 3 hours ago

                                                  They use Monero because what they are doing (copyright infringement) will get you into big trouble anywhere in the western world. Without cryptocurrencies, many of the modern large-scale archival efforts wouldn't be possible, or at the very least the risks for the people participating in them would be significantly higher. For me this alone is a good enough reason to admit that there are valid reasons for the existence of privacy coins.

                                                  The harm they may cause in the short term via tax avoidance or being used to buy drugs is minimal, but the possibility that because of them archivists are able to fund servers for data that future historians wouldn't have otherwise been able to get their hands on? Priceless.

                                                  • akimbostrawman 3 hours ago

                                                    >Why are they using crypto, and, weirdly enough, specifically the crypto people use for buying drugs, to award this?

                                                    You really have to ask why an illegal/grey site is using a currency that is built to protect privacy and anonymity?

                                                    Is this some kind of sarcasm?