Everything runs on AWS. The infrastructure is set up with Terraform. The Lambda retrieves three C1-level Dutch words, their translations, and an example from ChatGPT. The words are stored in DynamoDB so they will not be sent again, and are then sent to my email.
I didn't want to pay for expensive vocabulary apps that often start with beginner words while I am looking for advanced vocabulary, so I built it myself.
The idea is really nice, but AWS sounds like overkill? Using the same Python file with an SQLite db (or a text file) and an API like Mailgun to send the emails, it could run on any machine with a plain cron job?
I built a comparable system that sends me an email every day that I can respond to, to maintain a journal; it works like described above and has been running for about 5 years now with zero downtime.
Anyway the idea is really good!
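For anyone curious, the SQLite-plus-cron variant suggested above can be sketched roughly like this. It only covers the "don't send a word twice" part; the table name `sent_words`, the function name `pick_unsent`, and the file name `words.db` are made up for illustration and are not taken from the original project.

```python
import sqlite3


def pick_unsent(conn, candidates, count=3):
    """Return up to `count` candidate words that were never sent, and record them.

    `sent_words` is a hypothetical table name used only for this sketch.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS sent_words (word TEXT PRIMARY KEY)")
    already_sent = {row[0] for row in conn.execute("SELECT word FROM sent_words")}
    # dict.fromkeys drops duplicate candidates while keeping their order.
    fresh = [w for w in dict.fromkeys(candidates) if w not in already_sent][:count]
    conn.executemany("INSERT INTO sent_words (word) VALUES (?)", [(w,) for w in fresh])
    conn.commit()
    return fresh
```

A script that calls `pick_unsent(sqlite3.connect("words.db"), words)` and then mails the result could be scheduled with a crontab entry such as `0 9 * * * python3 /path/to/script.py` (path hypothetical).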
> using an API like Mailgun to send the emails
Don't need that. You're already paying for (or self-hosting) your primary email address, right? That includes sending emails from that address. Use those same login credentials to send emails to yourself; there's no need to contract a third party for sending a handful of emails per day, especially to yourself.
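Sending to yourself with your existing mailbox credentials could look something like the sketch below, using only Python's standard library. The SMTP host, port, and the use of STARTTLS are assumptions: they depend on your provider, and some providers require an app-specific password instead of your normal one.

```python
import smtplib
from email.message import EmailMessage


def build_message(address, words):
    """Build a plain-text email sent from and to the same mailbox."""
    msg = EmailMessage()
    msg["From"] = address
    msg["To"] = address
    msg["Subject"] = "Daily Dutch words"
    msg.set_content("\n".join(f"{word}: {translation}" for word, translation in words))
    return msg


def send(msg, host, port, user, password):
    """Send via the provider's SMTP server; host and port are provider-specific assumptions."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()  # assumes the server supports STARTTLS (commonly on port 587)
        server.login(user, password)
        server.send_message(msg)
```

Usage would be along the lines of `send(build_message("me@example.com", [("vaststellen", "to determine")]), "smtp.example.com", 587, "me@example.com", password)`, where every value is a placeholder.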
Yep, might be a bit overkill, but as mentioned in other comments, this project was more for fun and learning, less for efficiency :)
I agree completely. As a Dutch person myself: "overdaad schaadt" (excess does harm), which also teaches some of our pragmatism. We tend to implement simple solutions. We have no time to waste when the water comes in.
That said, I appreciate the idea and wish you luck with learning the language (as NT2, I assume). Questions welcome.
Could it, instead of giving the direct English translation, define the word in (simple) Dutch? I think this immersion would help improve understanding of the language directly, as opposed to rote memorization, especially at the more advanced level you are targeting.
Honestly I was thinking about that. Or how to best display the new words, so I totally see your point. I might change this in the future, but for this first iteration I just thought: leave the English translation in and see how it works ...
AnkiDroid is a free, self-contained implementation of Anki for Android devices.
Are you reinventing a stove-pipe version of Anki, based on cloud services and e-mail?
Are you really going to send yourself 300 e-mails when 300 cards are due that day?
This is the opposite of Anki: words are shown only once.
For anyone else trying to achieve the same thing from scratch: if you have a Google account, Google Apps Script might be able to do the same thing for free and without having to worry about VMs, storage, or anything else. You could store stuff on your Drive, or literally just search your own inbox for the existing word to check if it's already sent.
Goed gedaan!
The real learning isn’t language but cloud infrastructure.
This is eerily well-timed!
My partner and I do something similar for Korean & English (she’s Korean native and is fluent in English and I’m learning Korean). We actually built it out for ourselves and some friends and just released it yesterday[0].
Still working out some kinks, but it sends a question every weekday via email that you’d respond to. It then sends back feedback on vocab & grammar, all with spaced repetition baked in to keep track of words you learn/use as you continue.
It’s currently tailored towards those that can already read and have basics under their belt.
An end-to-end example of a single question would be helpful to see.
The design is superb! Seriously such an incredible looking site
화이팅!
You've made some cool stuff, inspirational.
I love the design so much.
With all respect and love to the OP, I must admit that I laughed out loud when I saw the AWS architectural diagram and wondered whether this might be a joke. Personally, I'd have implemented this as a few dozen lines of Python living as a cron job (or even as a long-lived process with a schedule), but I'm no pedigreed engineer.
Fair enough! As mentioned earlier, one reason I used AWS/Terraform is for personal learning. It may not be the most efficient approach, but I built it this way because it was the most enjoyable for me. :)
I do the same on my personal projects. Big over engineering projects for learning purposes :-)
If you're using Terraform on AWS as a learning experience I hope you're using a pre-paid card.
> With all respect and love to the OP, I must admit that I laughed out loud when I saw the AWS architectural diagram
OP actually did it more efficiently than most! You should see the AWS suggested architecture. It uses something like 10 different AWS services.
My company actually set out to solve this very problem. We have a cloud cron hosting that's more reliable than the AWS architecture but just requires a few lines of code. Literally this is all you have to do:
@DBOS.scheduled('* * * * *')
@DBOS.workflow()
def example_scheduled_workflow(scheduled_time: datetime, actual_time: datetime):
    DBOS.logger.info("I am a workflow scheduled to run once a minute.")
https://github.com/dbos-inc/dbos-demo-apps/blob/main/python/...
I think this is where Cloudflare shines. They just focused on the essentials with Workers (“serverless”) at the core of everything instead of VPS at the core of everything.
Yes, DBOS has a similar philosophy. Strip away all the hard and annoying parts, let you just code. Our other philosophy is "just do it in Postgres". :)
FWIW you can't really do the same thing on Cloudflare workers -- their crons are "best effort", and you'd still need to get storage somewhere else. With DBOS the storage is built right in.
Cloudflare Durable Objects have alarms you can use to imitate cron, and have storage built-in (there's even support for SQLite databases attached to DOs in beta)
You’re not kidding about AWS’s own architecture diagrams.
Although if you drew that out you'd have about the same.
Cron trigger.
Process.
Gpt API.
Database for persistence.
Email sender.
Which part of that wouldn't you have?
This is a great fit for Google Apps Script.
Who likes to learn a niche scripting language that only works on one platform?
It's ordinary JavaScript. It interacts with many Google services, including Gmail, without having to actually maintain servers or set up authentication. It's perfect for small glue tasks like sending yourself emails or anything that interacts with Sheets. You wouldn't use it if you weren't trying to use a Google service.
The first example in the first screenshot isn't very idiomatic. I'd say hulp rather than 'guidance' when filling out a form. It works, but I don't know that anyone would say that
The second one, I'd say either bepalen/beslissen (if you want to make a decision) or uitvinden ("out-finding", find out). The word from the screenshot, vaststellen (literally: "fixed setting", think of it as fixating), is still in common enough use, particularly in formal writing, but more of a word for "good to know" than to use in active vocabulary
No comments on the third one :) That's idiomatic use (though I'd have thought of, just like in English, "not falling over" as opposed to "work-life balance" as the defining meaning of the word)
The readme says the examples are generated using ChatGPT. Why not use an existing dataset instead of generating mediocre examples with lots of energy? Similar to what YouGlish(.com) does, you could get a lot of sentences spoken by native speakers from YouTube transcripts for example, or Wikipedia for written language, or other sources costing virtually no energy at all to find a word in and being better as well
I see your point! I also wouldn't see ChatGPT as the ultimate source for language learning. I just occasionally used it to generate some words for me, and I found it helpful, so I automated that. I like the idea of getting something out of transcripts; that would make it more realistic and practical!
I love this idea!! I'm working on Dutch learning as well and made a learners immersion dictionary for it;
So going by the screenshot in the readme where you have vaststellen; https://hetnederlands.com/dictionary/vaststellen
The things you can do with language learning and LLMs are just incredible :)
Oh neat, how did you generate that data?
Nuenki uses processed wiktionary data. Its definition for that word is this: https://dictionary.nuenki.app/get_definition?language=Dutch&...
(ofc rendered nicely in the client).
just pure o4 mini in a sequence of prompts. I haven't done any fine-tuning or involved any dict api (yet), but the accuracy/quality can definitely be improved from there.
For Dutch, I would use something like Van Dale or Woordenlijst (het Groene Boekje); both have free online versions.
What I really want are automated emails interspersed during the workday with my overdue Anki cards. It should be one click straight from the email to answer the quiz card, and appropriately rescheduled to my inbox in case of a memory miss. Spaced repetition quizzing is essential to memorizing anything, and Anki is really the most popular app in the world for that purpose.
I already spend all my time in the inbox and find it hard to ignore an email. Inbox zero habits would kick in and ensure that I do at least some memorization every day. A single Anki card in my inbox is far less daunting than the entire deck staring at me when I open the app.
Unfortunately Anki doesn't have a proper API and isn't easy to reverse engineer. I tried to build something using a scraper that logs in to the Anki web app, but it turned out to be very janky, and couldn't identify overdue cards. Somebody with better desktop app/python skills could probably do it locally, but I gave up.
> Unfortunately Anki doesn't have a proper API and isn't easy to reverse engineer
Tried any of the below?
AnkiConnect (HTTP API): https://git.foosoft.net/alex/anki-connect
Rust: https://github.com/ankitects/anki/tree/main/rslib via Protobuf: https://github.com/ankitects/anki/tree/main/proto/anki
Rough DB Schema (outdated, but sufficient): https://github.com/ankidroid/Anki-Android/wiki/Database-Stru...
AnkiConnect is decent but still relies on an actual Anki desktop application to be running.
What I do is put the Anki widget quite front-and-center on my phone. Whenever I absentmindedly unlock my phone, the red squircle containing a positive number activates my monkey brain and I want to get it to 0.
I agree that spaced repetition is essential and Anki is just the main player. I think the ideal product would combine: a flashcard app like anki, automated emails you can reply to, audio nudges and more ...
Using the AnkiConnect add-on you could do all this in under an hour, fyi
I made a somewhat similar zsh shell script the other day: I wanted to receive daily notifications about topics I'm comfortable with, but in Spanish at B1 level and at semi-random intervals. It's just a couple of lines of zsh: https://github.com/thiswillbeyourgithub/Daily_Fact_Ntfy
Good busy!
Is this a decent fit for LLM?
“Talk to me in <language> and point out my grammar errors in English”
I imagine it’s risky, learning bad habits. But it seems like it might be very convenient. I believe the biggest issue for me is actually using a language regularly. But I’m way too socially afraid to do one of those “speak to a random person live” things.
Or even some sort of, “translate all my emails to <language>, but show English when I mouse over.”
I bolstered my French by setting almost all my video games to French in university. It helped me a ton, and was accessible because I understood the context.
Translation tech has come a long way. Might not even need LLMs.
The one and only actually useful use-case I've found for ChatGPT in my life (since it can't handle assisting my extremely basic coding work) has been "break this Japanese sentence down word-by-word and explain the grammar." On the surface, it seems more helpful for understanding and learning than simply putting the words into a JP/EN dictionary (which doesn't explain grammar at all) or putting the entire phrase into Google/Bing Translate (which makes it too easy to mentally ignore the grammar points I need to learn).
Reading the other couple of replies, though, maybe I should rethink doing even that.
Yes, as always it’s risky to use an LLM for something you’re not already familiar with. I guess for English or Spanish it’s good because it has a large corpus, but for a smaller language like Italian it’s quite bad.
Tried having it generate German puzzles (normal sentence with a missing word like "der" or "dem" or so) after someone blogged that it would be worth like 90% of a language teacher for 1% of the price. I'm not very good at German but most things it proposed seemed wrong to me. The whole point is that I don't have to talk to a native speaker, but I decided to show the conversation to one, who then said something like "yeah no, you're correct half the time and the computer is wrong even more times"
Maybe I should feed it bits from Wikipedia and have it censor word classes for me (or is part-of-speech identification by human-made algorithms reliable?), but that's a lot more involved to code up than prompting it "hey just do this task". I'm sure I'm just holding it wrong and it can be a useful language teacher in some way, e.g. I have had good results with 1:1 translations, but don't expect it just does what you ask it when you can't verify the result
How much does it cost to run???
A dozen events and seconds of runtime per month? If free tier itself had a free tier, it would be a blip on it.
Indeed, it's peanuts :) I didn't calculate it as I find that cost insignificant.
Agree with others. This is overkill. You can have similar effect by following Dutch social media accounts (libraries, museums, and bookstores are particularly good), subscribing to Dutch news email newsletters, even changing your OS on your phone and computer to Dutch if your Dutch is already at a good enough level.
Yes, there are better ways for sure. This shall just be a small piece in the puzzle :)
I have built something similar except with a list of warm up exercises and with GitHub Actions.
I suppose a bank of words in a .csv file, a script which selects words, and a job triggered on a cron schedule which opens an issue does the trick. I had it set up so that when an issue is opened, I get emailed.
The pro of this approach is you don’t have to deploy any infra. The con is that your emails never look as nice as yours do :’)
My approach to learning Dutch is probably a bit unusual. I import and sell Dutch bicycles and bike parts. Turns out, this is very difficult to do without accidentally learning some Dutch :) (It's all the wrong variety of Dutch, though: I can talk about bike mechanics, but cannot ask for directions.)
Since you're using python, I may as well plug the py-fsrs project in case you wanna add spaced repetition to it: https://github.com/open-spaced-repetition/py-fsrs
Wow. I was thinking of implementing spaced repetition for a project I’m working on, this will be very useful. Thank you for creating it.
I built a small personal service to do this for Japanese. Five words + one idiom every day at 9 a.m. It's certainly not the best way to learn/study, but it is a nice passive way to stay engaged with the language.
I think that's spot on. It's not about writing perfect software for learning a language. It's just a little extra to keep you engaged and reminded!!
Goed bezig!
What's the source of the words/dictionary? Where are you storing them?
The words are generated by the ChatGPT API, and I store them in DynamoDB.
You could use the data you've collected in the DB to generate a quiz that tests your knowledge of the words. If you track how many times you entered the correct answer and sort in ascending order on that field, you will be presented with the least-known words first. Easy alternative to spaced repetition.
I initially imagined a script that would send an email generated by an LLM that you could reply to in the target language. Basically, an LLM pen pal that will email you regularly. Seems like a fun idea.
I did something similar for German, but in my case I used Claude to generate a full dialog, plus the English translation.
I was much more lazy, though, and set this up on Zapier.
What I do is play language games with LLMs. And ask them to explain what I'm doing wrong, it's much nicer than Duolingo.
great idea. imo there's still tons of business opportunity in email, even if people see it as legacy. that makes it more compelling, because you'll face lots of addressable market and less competition.
You could generalize this into all sorts of reminders , notices, affirmations, quotes.
this is similar except for learning Chinese and it publishes videos to youtube and they have simulations generated in the videos: https://www.youtube.com/watch?v=4R3zudq9v8M
Or you can just read books, comics and newspapers, and watch tv shows and movies.
See also Dr Krashen's comprehensible (not comprehensive) input theory. Lots of YouTube channels that offer graded videos in this style (to various degrees of adhering to the theory, i.e. some are simply grammar lessons which is not CI). The most well known is probably Dreaming Spanish.
I've had really great success with my children using national TV networks iOS apps with a VPN, e.g. SVT Barn (Swedish), WDR/ZDF/ARD (German), etc
You could add tracking to build an Anki-like system for repetition and learning.
I want to do that if I ever find the time. Adding a date to the database entries, and some code to throw an old word in here and there based on spaced repetition best practices.
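One very simple way to do the "throw an old word in here and there" part, assuming each stored word has the date it was first sent: resurface a word when its age hits one of a few fixed offsets. The offsets below are arbitrary example values, and this is a fixed-interval schedule, not a real spaced-repetition algorithm such as the one in Anki or FSRS.

```python
from datetime import date

# Hypothetical review offsets, in days after a word was first sent.
REVIEW_OFFSETS = (1, 3, 7, 14, 30)


def due_for_review(sent_on, today):
    """Given a mapping of word -> first-sent date, return the words due today, sorted."""
    return sorted(
        word
        for word, first_sent in sent_on.items()
        if (today - first_sent).days in REVIEW_OFFSETS
    )
```

Because it only compares dates, this would assume the job runs exactly once per day; a missed day means a missed review, which a proper scheduler would handle by tracking the next due date per word instead.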
Using speech to text you could say the answer and it could validate your answer. If AI engine is powerful enough it could have you say the foreign word and rate your pronunciation.
As for spaced repetition, I developed an alternative which just has a column for the number of times the correct answer was given, ordered ascending on that field. This gives you new words first, followed by words you've barely gotten correct, etc.
> build an anki like system
...or use Anki? Set a calendar reminder to open the app, then there's a similar notification area trigger as with emails
Too much friction, it ruins the email system he made.
cool project! E-mail seems like a good channel for small chunks of language-learning content + reminders.
If I may ask you, how do you plan on building vocabulary from these e-mails? Do you use anki or some other method?
I did some language courses, so now I just want to improve my vocabulary. I used anki for a while but once I got out of it I found it hard to get in again. That's why I like those emails, they don't take much time and you can start every day again. Otherwise I just try to immerse myself in the language with youtube, netflix ... :)
..videoland ;)
Cool project!
Seems a bit complex though, compared to doing a shell script showing a notification or sending yourself an email each morning when you open it?
Or just doing a light script on val.town?
For instance this could be an example val.town script that does something similar (just need to bind to a data source for the dictionary)
import { sqlite } from "https://esm.town/v/stevekrouse/sqlite";
import { OpenAI } from "https://esm.town/v/std/openai";
import { email } from "https://esm.town/v/std/email";

// Dutch words database
const dutchWords = [
  { word: "boek", translation: "book" },
  { word: "huis", translation: "house" },
  { word: "boom", translation: "tree" },
  { word: "water", translation: "water" },
  { word: "kat", translation: "cat" },
  { word: "hond", translation: "dog" },
  { word: "appel", translation: "apple" },
  { word: "tafel", translation: "table" },
  { word: "school", translation: "school" },
  { word: "fiets", translation: "bicycle" }
];

export default async function generateDutchWordLearning() {
  const KEY = new URL(import.meta.url).pathname.split("/").at(-1);
  const openai = new OpenAI();

  // Ensure SQLite table exists
  await sqlite.execute(`
    CREATE TABLE IF NOT EXISTS ${KEY}_dutch_words (
      word TEXT PRIMARY KEY,
      translation TEXT,
      example TEXT,
      timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
    )
  `);

  // Fetch words not previously used
  const usedWords = await sqlite.execute(`
    SELECT word FROM ${KEY}_dutch_words
  `);
  // `let`, not `const`: the list is reassigned below when the word bank runs out
  let availableWords = dutchWords.filter(
    w => !usedWords.rows.some(row => row.word === w.word)
  );
  if (availableWords.length < 3) {
    // Reset if we've used all words
    await sqlite.execute(`DELETE FROM ${KEY}_dutch_words`);
    // Copy the array so the splice() below does not mutate dutchWords
    availableWords = [...dutchWords];
  }

  // Randomly select 3 unique words
  const selectedWords = [];
  for (let i = 0; i < 3; i++) {
    const randomIndex = Math.floor(Math.random() * availableWords.length);
    selectedWords.push(availableWords.splice(randomIndex, 1)[0]);
  }

  // Generate example sentences with ChatGPT
  const wordDetails = await Promise.all(selectedWords.map(async (wordObj) => {
    const exampleResponse = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [{
        role: "user",
        content: `Geef een voorbeeld zin met het woord "${wordObj.word}" in het Nederlands.`
      }]
    });
    const example = exampleResponse.choices[0].message.content || "Geen voorbeeld gevonden";

    // Store in database
    await sqlite.execute(`
      INSERT INTO ${KEY}_dutch_words (word, translation, example)
      VALUES (?, ?, ?)
    `, [wordObj.word, wordObj.translation, example]);
    return { ...wordObj, example };
  }));

  // Prepare HTML email
  const htmlContent = `
    <html>
      <body>
        <h2>Dutch Word Learning</h2>
        ${wordDetails.map(w => `
          <div>
            <h3>${w.word} (${w.translation})</h3>
            <p><em>Example:</em> ${w.example}</p>
          </div>
        `).join('')}
      </body>
    </html>
  `;

  // Send email
  await email({
    subject: "Your Daily Dutch Words",
    html: htmlContent,
    text: wordDetails.map(w =>
      `${w.word} (${w.translation}): ${w.example}`
    ).join('\n')
  });
  return wordDetails;
}
I won't argue with that: it's rather complex for what it does. The reason I still did it this way was that I wanted to get the emails automated, without doing anything manually. Even if I only needed to open my laptop, or run a script once, I think I would just stop at some point, and I don't think it would ever become a habit. Are there other tools that could get this project done with less complexity? Probably, but I have the pride of an engineer and wanted to brush up on my Terraform ;)
the val.town way doesn't require you to open your laptop... it's just "lighter" than having a whole terraform infra
Is that code from an LLM?
Based on the style with comments above each block it seems very likely to be from chatgpt or claude
Yeah.
Kind of weird we have people submitting GPT samples to people that likely have GPT themselves and could ask it for one if that's what they wanted.
But then plenty of people link google searches as though that makes sense.
yes, basically just asked the val.town AI bot to write it, it probably needs a few bugfixes here and there, but the idea was to show that there are services that do that in 50 lines of code, rather than spinning up a big infra
Nice work
Duolingo?
I remember reading a joke once...
What's the hardest European language to learn?
Dutch.
Why?
Because every time you speak to them in Dutch they respond to you in English.
It seems this is a way around that :-D
(I don't actually think it's the hardest language but have found that yes, many Dutch speak English very well)
I am in Amsterdam right now and yes, I have yet to encounter a Dutch person that doesn’t speak very fluent English.
As a Dutchman from outside of Amsterdam (you know, most of us):
Hah!
It's not even that they won't speak Dutch, often they can't! Sometimes you'll be hard-pressed to find someone capable of speaking Dutch in Amsterdam in some shops and restaurants. I've had people look sheepish/annoyed for presuming to use and expect Dutch in my own country.
Exactly that. You'll have a harder time not speaking English than not speaking Dutch.
It's not the norm anywhere outside of Amsterdam I'd say, but indeed, we had a server/waiter(?) in a Greek restaurant in Limburg yesterday who spoke German but not Dutch (who looked like they might be from Greece so I doubt they were simply from Germany). Especially since the pandemic I've been noticing this more. I like the culture mingling, all the better that the Limburgians see foreigners aren't scary and evil, but I'm curious if it's a trend or if I'm just randomly noticing it more
Try less touristy areas though, or people you don't normally interact with much (who will, conversely, also not have much experience interacting with non-Dutch people). My grandma couldn't say more than yes or no, and couldn't understand much more
Working an IT job in a company of ~30 employees, someone joined who didn't speak Dutch. They would always excuse themselves and have lunch in their office¹ because it was very obvious that half the people just didn't really interact with the previously lively conversation anymore and were just biding their time to get back to work. Those who did speak, it worked but it's not as jovial as before. Sure, these people can all hold a presentation about their field of work, or order a sandwich with the correct words in England, but a spontaneous conversation about something random? It's a different set of vocabulary that you need every day, and far from everyone has that
¹ yes, we made clear they shouldn't do that and they should feel invited and part of the team. Many people did interact. And many of us made sure they were, at least, not having lunch alone in their office. Situation unfortunately remained as it was until I left
I speak Dutch fluently (born and raised) and even I have a hard time speaking Dutch with Dutch people. If you don't fit the profile (blond hair/blue eyes) they automatically assume you're a foreigner.
Oh yeah we only allow people to speak poor English in very public functions, like the head of state or the secretary general of NATO.
It helps make the rest of us look good.
I don't have an active memory of hearing either of them speaking poor English. Can't be true.
Clearly a dig at Mark Rutte...
It is a rather big problem, yes. You can absolutely get by without speaking any Dutch, I know people who have spent 10+ years in the country with just very basic knowledge of the language. Absolutely kills the motivation for a lot of people.
You can't even order food or drinks in Dutch anymore in a lot of places in Amsterdam. It's a bit of a bummer when you are back in your home country and can't even speak your mother tongue
Also Dutch is, let's put it this way, not the prettiest language, nor the most useful. I'm sure that also kills plenty of motivation.
I’d disagree, on the pretty front.
As I’ve learned it, I found it very charming and often surprisingly sweet - as an example idiomatic terms for urination and defecation are very funny: plassen (making a large pond) and klaaivormen (forming clay) - add to that a rather easy to rhyme language with a tendency towards charming and heartfelt emotional range, and the end result is quite nice.
Add lots of domestic and Caribbean regional variation in the home countries, close sister languages: Vlaams (certainly in its higher form a very different register of the language than the Hollands standard form), Afrikaans and West-Frisk, Papiamento etc and you’ve got a very cosy (gezellig!) and dynamic inter-language community!
The aggressive simplification of standard Dutch initially offended my tastes, but later I’ve found that particular discipline improved my English by accident and I’m now a fan of the sparse elegance and surprising nuance of that style …
I think you mean 'kleien' instead of 'klaaivormen'?
I’ve heard “kleivormen” in Hoorn, “klei maken” (a little more gross - no surprise given their famous export, disease swearing, I suppose) in Den Haag, and “kleien” below the great rivers.
Edit: spelling, never ran into the NL word for clay in writing as an adult language learner not into geophysics, civil engineering or pottery
What makes a language pretty? I'm not sure I ever saw/heard one that was pretty beyond what I'd say is in the eye/ear of the beholder
But agreed on it being pretty useless outside of a few small regions / couple million speakers. I've been saying we should apply winning team joining and get to something more internationally useful, as everyone here seems to already agree we are small and that trade and cooperation has brought the current prosperity. The area I'm from, though, people clutch to local dialects as cultural heritage that should be continued to be spoken... it doesn't even have a writing system... whatever, I don't mind so long as people are okay with a useful language alongside