Comments Page - Show HN: What is HN thinking? Real-time sentiment and concept analysis

Show HN: What is HN thinking? Real-time sentiment and concept analysis (ethos.devrupt.io)
Submitted by ddtaylor 3 days ago

kretaceous 2 days ago
This is really cool and something I've envisioned building for a long time!
There is a bug in the entity tracking. For the entity "github", it shows a positive sentiment. HN does NOT like GitHub (for reasons good or bad). If you click on it, it shows you other, seemingly unrelated stories.
https://ethos.devrupt.io/entities/github
- ddtaylor 2 days ago
  Thank you. I believe this is because it's not properly aggregating the story title, content, and comment hierarchy. There are going to be cases where the LLM does a poor job of understanding the conversation, but I think right now the information isn't being sent to the prompt.
  Right now it seems to be only using one level of the parent comment hierarchy.
  (Source: https://github.com/devrupt-io/ethos/blob/67670eb2855b84d389d...)
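  For illustration, a minimal sketch of what sending the full hierarchy to the prompt could look like. This is not the project's code: the `Item` class, its fields, and the `build_context` helper are all hypothetical, and the sample thread is made up. The idea is simply to walk every parent link up to the story, so a short reply like "Agreed." still carries the story title and all ancestor comments rather than only its immediate parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """Hypothetical stand-in for an HN story or comment (not the project's model)."""
    id: int
    author: str
    text: str
    parent_id: Optional[int] = None  # None marks the story itself
    title: Optional[str] = None      # only stories carry a title


def ancestor_chain(item_id: int, items: dict[int, Item]) -> list[Item]:
    """Follow parent links from an item up to the story; return them root-first."""
    chain: list[Item] = []
    seen: set[int] = set()  # guards against accidental cycles in the data
    current = items.get(item_id)
    while current is not None and current.id not in seen:
        seen.add(current.id)
        chain.append(current)
        if current.parent_id is None:
            break
        current = items.get(current.parent_id)
    chain.reverse()
    return chain


def build_context(item_id: int, items: dict[int, Item]) -> str:
    """Render the story title, story text, and every ancestor comment as prompt text."""
    lines: list[str] = []
    for depth, item in enumerate(ancestor_chain(item_id, items)):
        if item.title is not None:
            lines.append(f"Story: {item.title}")
            if item.text:
                lines.append(f"Story text: {item.text}")
        else:
            label = "Target comment" if item.id == item_id else "Parent comment"
            lines.append(f"{'  ' * depth}{label} by {item.author}: {item.text}")
    return "\n".join(lines)


# Made-up sample thread, three comment levels deep.
items = {
    1: Item(1, "alice", "", title="GitHub changes its pricing"),
    2: Item(2, "bob", "This is the third change this year.", parent_id=1),
    3: Item(3, "carol", "And each one made things worse.", parent_id=2),
    4: Item(4, "dave", "Agreed.", parent_id=3),
}

print(build_context(4, items))
```

  With only one level of parent context, the model would see "Agreed." next to carol's comment and nothing else; with the full chain it can tell that the agreement is a negative statement about the story's subject.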
sdwr 2 days ago
Awesome idea! The entity tracking is very exciting, most interesting part imo
Unfortunately, I think the budget is noticeable in the sentiment analysis: the tags and entity recognition are good, but the sentiment ratings themselves seem pretty sloppy.
- ddtaylor 2 days ago
  I think it's mostly prompting, but I will be experimenting with this more. The prompt currently is garbage IMO:
  You are an expert analyst of the Hacker News community. Analyze submissions for the underlying ideas, concepts, technologies, and entities being discussed. Write all summaries in third-person analytical prose. Do NOT start sentences with "The user", "The commenter", "The author", or "This post". Instead, lead with the substance: describe the idea, argument, or phenomenon directly. Good: "Decentralized identity systems could reduce reliance on corporate gatekeepers." Bad: "The user discusses how decentralized identity systems work."
  (Source: https://github.com/devrupt-io/ethos/blob/67670eb2855b84d389d...)
  atoav 2 days ago
  Garbage, why? That is the insightful bit you chose to omit. How would you do it instead?
  ddtaylor 2 days ago
  It leaves a lot of interpretation to the model. For example it doesn't give any guidance on concept naming or disambiguation, which leaves all of that work to the JSON schema.
  In my experience it's much more effective to reference key terms or ideas in the JSON schema and then explain those and their constraints in the system prompt.
  This is one reason why people often think one model performs better than another for tasks they are both capable of. The real question IMO becomes: does packing in all of that extra input prompt (a) eat too much context or (b) increase cost too much?
  We will post an update on this on our blog in the future: https://blog.devrupt.io/
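  A rough sketch of the approach described above, under stated assumptions: the schema uses a few key terms, and the system prompt defines each term and its constraints. The `GLOSSARY` wording, the `SCHEMA` shape, and both helper functions are hypothetical and are not taken from the Ethos repository. The token estimate uses the common rule of thumb of roughly four characters per token for English text, which is only an approximation and varies by tokenizer.

```python
import json

# Hypothetical glossary: each key is a term the schema uses, each value is the
# guidance the system prompt gives for it. The wording is illustrative only.
GLOSSARY = {
    "concept": "A short lowercase noun phrase (1 to 3 words) naming an idea, "
               "e.g. 'static analysis'. Reuse an existing name when one fits.",
    "entity": "A specific company, product, or person. Use the canonical name: "
              "'GitHub', not 'github.com' or 'GH'.",
    "sentiment": "How the text feels about that one entity, from -1.0 (hostile) "
                 "to 1.0 (enthusiastic). Not the overall tone of the text.",
}

# Hypothetical output schema. It only names the terms; the definitions and
# constraints live in the system prompt.
SCHEMA = {
    "type": "object",
    "properties": {
        "concepts": {"type": "array", "items": {"type": "string"}},
        "entities": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "entity": {"type": "string"},
                    "sentiment": {"type": "number", "minimum": -1, "maximum": 1},
                },
                "required": ["entity", "sentiment"],
            },
        },
    },
    "required": ["concepts", "entities"],
}


def build_system_prompt(base: str, glossary: dict[str, str]) -> str:
    """Append one definition line per schema term to the base system prompt."""
    rules = "\n".join(f"- {term}: {rule}" for term, rule in glossary.items())
    return f"{base}\n\nDefinitions for the terms used in the output schema:\n{rules}"


def rough_token_estimate(text: str) -> int:
    """Very rough size estimate: about 4 characters per token (an assumption)."""
    return len(text) // 4


base = "You are an expert analyst of the Hacker News community."
prompt = build_system_prompt(base, GLOSSARY)

print(prompt)
print(json.dumps(SCHEMA, indent=2))
print("Approximate extra tokens per request:",
      rough_token_estimate(prompt) - rough_token_estimate(base))
```

  The last line gives a first-order answer to the context and cost question: the glossary is paid for on every request, so its approximate size multiplied by the number of requests is the overhead to weigh against better naming and disambiguation.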
tangotaylor 2 days ago
The sentiment analysis is very interesting. I'm super curious what that looks like historically, going back to 2007.
- ddtaylor 2 days ago
  I currently have it limited to this "epoch" date while I tweak the prompts. Once I feel the prompt is done cooking, I will let it go back to 2007. But, also, gotta keep the lights on somehow ;)
  Also, hello fellow taylor.
esseph 3 days ago
This is virtually identical to tools the US Department of Homeland Security uses across each social media platform and major website with comments to monitor sentiment and activities.
Congrats, I guess.
- ddtaylor 3 days ago
  I was also told this by someone randomly while working at a coffee shop here in DC. Something about CGA.
sixtyj 2 days ago
Well done.
If I may suggest: please make the green colors in the sentiment split wheel more distinct; they look very similar right now.
vivzkestrel 2 days ago
Is there a blog post anywhere that explains how all of this works, the architecture, etc.?
- ddtaylor 2 days ago
  We wrote this https://blog.devrupt.io/posts/introducing-ethos/
Lapsa 2 days ago
I'm thinking about constantly getting bombarded with audible microwave voice messages for the past couple of years
- ddtaylor a day ago
  Epstein was written in COBOL because of static analysis.
  Lapsa a day ago
  and how is that related?
  ddtaylor a day ago
  I figured you were testing the analysis to see where it put you lol
  Lapsa a day ago
  no, I just straight up answered the question. it's fucking annoying, you know https://ieeexplore.ieee.org/document/9366412
claudegamedev 2 days ago
Jeffrey Epstein: 0.20% Positive! Lol.
Side note: this is cool, but the sentiment analysis could be a bit more sophisticated in v2.
- CatMustard 2 days ago
  I know I'm going against the HN hivemind a bit here, and I hope I don't get flamed too much for it - but I think that that Jeff Epstein fellow wasn't a very nice man.
  ddtaylor 2 days ago
  guidelines link
dk8996 3 days ago
Very interesting. LLMs open up space for transforming unstructured raw data into visualizations and dashboards. I made something just looking at “Who wants to be hired” posts.
https://hireindex.xyz/#stats
- ddtaylor 3 days ago
  Does that use the "real" LinkedIn API or something else like Playwright?
  What model does it use?
  What vector database is it using?