• est 5 days ago

    While these kinds of articles are useful for learners, I hope someone will please explain concurrency models for uWSGI/Gunicorn/uvicorn/gevent. Like how long do global variables live? How does context switching (like the magic request object from Flask) work? How to spawn async background tasks? Is it safe to mix task schedulers inside web code? How to measure when concurrency is full and how to scale? What data can be shared between executors and how? How to detect back-pressure? How to interrupt a long-running function when the client disconnects (nginx 499)? How to properly handle Unix signals for threads/multiprocessing/asyncio?

    In reality no one writes from scratch with threads, processes or asyncio unless you are a library author.

    • bmitc 3 days ago

      > In reality no one writes from scratch with threads, processes or asyncio unless you are a library author.

      Is that really true in the case of asyncio? From my experience, it isn't.

      I am not familiar with the other three libraries you mentioned, but gevent predates asyncio and is separate from it. It does not build on top of asyncio, and is outdated because of that. It doesn't even have support for WebSockets.

• WoodenChair 5 days ago

    Unfortunately this starts with a definition of concurrency quoted from the Python wiki [0] which is imprecise: "Concurrency in programming means that multiple computations happen at the same time."

    Not necessarily. It means multiple computations could happen at the same time. Wikipedia has a broader definition [1]: "Concurrency refers to the ability of a system to execute multiple tasks through simultaneous execution or time-sharing (context switching), sharing resources and managing interactions."

    In other words, it could be at the same time, or it could be context switching (quickly changing from one task to another). Parallel [2] explicitly means at the same time: "Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously."

    0: https://wiki.python.org/moin/Concurrency

    1: https://en.wikipedia.org/wiki/Concurrency_(computer_science)

    2: https://en.wikipedia.org/wiki/Parallel_computing
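
    A minimal Python sketch of the distinction: asyncio interleaves two tasks on a single thread, so this is concurrency via time-sharing with no parallel execution at all.

      import asyncio

      async def task(name: str) -> None:
          # Each await is a switch point: the event loop suspends this
          # task and resumes the other, all on one OS thread.
          for i in range(3):
              print(f"{name}: step {i}")
              await asyncio.sleep(0)  # yield control to the event loop

      async def main() -> None:
          # Both tasks make progress by context switching, not simultaneity.
          await asyncio.gather(task("A"), task("B"))

      asyncio.run(main())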

    • CoconutPilot 4 days ago

      > definition of concurrency quoted from the Python wiki [0] which is imprecise: "Concurrency in programming means that multiple computations happen at the same time."

      Surprising to some, this is the literal definition. The word "concurrent" is a compound from Latin: "con" translates as same and "current" translates as time. Concurrent literally means "same time". Comp Sci really needs to use a different word.

      • WoodenChair 4 days ago

        > Surprising to some, this is the literal definition.

        I don't think it's surprising to anyone who speaks English what the definition of concurrency in everyday language is. It is not the same as the computer science definition, and that's unlikely to change anytime soon. Can anyone with permission fix the Python wiki?

      • Spivak 4 days ago

        Which, while it sounds like a nit, means that by the OP's definition of concurrency, asyncio as implemented in Python (and others) is not a form of concurrent programming.

        • bmitc 3 days ago

          Could you elaborate?

          While I am no fan of Python's backwards approach to multi-core programming, asyncio achieves concurrency when a coroutine reaches out to the operating system for network calls. So while two asyncio tasks are awaiting returns from long-running network calls, the operating system can be running the network calls concurrently.
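
          A small sketch of that point, with asyncio.sleep standing in for the network call (the real call and its URL are omitted): both waits overlap, so the total is roughly 1 second rather than 2.

            import asyncio
            import time

            async def fake_network_call(name: str) -> str:
                # Stand-in for e.g. an HTTP request; while this task is
                # suspended, the OS-level wait proceeds concurrently.
                await asyncio.sleep(1)
                return f"{name} done"

            async def main() -> None:
                start = time.perf_counter()
                results = await asyncio.gather(
                    fake_network_call("task-1"),
                    fake_network_call("task-2"),
                )
                print(results, f"{time.perf_counter() - start:.2f}s")

            asyncio.run(main())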

• whilenot-dev 5 days ago

    > Summary

      if cpu_intensive:
        'processes'
      else:
        if suited_for_threads:
          'threads'
        elif suited_for_asyncio:
          'asyncio'

    Interesting takeaway! For web services mine would be:

    1. always use asyncio

    2. use threads with asyncio.to_thread[0] to convert blocking calls to asyncio with ease

    3. use aiomultiprocess[1] for any CPU-intensive tasks (a sketch combining all three follows below)

    [0]: https://docs.python.org/3/library/asyncio-task.html#asyncio....

    [1]: https://aiomultiprocess.omnilib.dev/en/stable/
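
    A minimal sketch combining all three (the file path and workloads are placeholders; assumes aiomultiprocess is installed):

      import asyncio
      from aiomultiprocess import Pool  # third-party: pip install aiomultiprocess

      def blocking_io(path: str) -> bytes:
          # An ordinary blocking call, kept off the event loop via to_thread (2).
          with open(path, "rb") as f:
              return f.read()

      async def cpu_task(n: int) -> int:
          # CPU-bound work, fanned out to worker processes (3).
          return sum(i * i for i in range(n))

      async def main() -> None:
          data = await asyncio.to_thread(blocking_io, "/etc/hostname")  # (2)
          async with Pool() as pool:                                    # (3)
              results = await pool.map(cpu_task, [10_000, 20_000, 30_000])
          print(len(data), results)

      if __name__ == "__main__":
          asyncio.run(main())  # (1) asyncio as the top-level model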

    • d0mine 5 days ago

      A common variant in practice: don't use pure Python for CPU-intensive tasks (offload to C extensions).

      • eternityforest 3 days ago

        And those C extensions usually release the GIL, so you can do real multithreading to some extent.
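
        For instance, hashlib's C implementation drops the GIL while hashing large buffers, so a plain thread pool can occupy several cores (a small illustrative sketch):

          import hashlib
          from concurrent.futures import ThreadPoolExecutor

          # hashlib releases the GIL for large inputs, so these
          # threads can genuinely run on multiple cores at once.
          blobs = [bytes(16 * 1024 * 1024) for _ in range(4)]

          with ThreadPoolExecutor(max_workers=4) as pool:
              digests = list(pool.map(lambda b: hashlib.sha256(b).hexdigest(), blobs))

          print(digests[0][:16])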

• daelon 5 days ago

    Don't know if the author will show up here, but the code highlighting theme is almost unreadable, at least in Chrome on Android.

    • nosioptar 4 days ago

      Looks fine to me on Firefox/Android in light mode.

      It's unreadable in dark mode.

    • tomtom1337 5 days ago

      Same on iPhone, Safari.

• t_mahmood 4 days ago

    Is anyone else seeing multiprocessing freeze on Windows? It seems you need a __main__ guard for multiprocessing to work on Windows, which I don't have, as I'm using pyproject scripts with click to run. Has anyone faced this issue? Is there a solution?

    • griomnib 4 days ago

      It’s not hard to use main, and it’s a requirement for multiprocessing.
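
      A sketch of how that fits a pyproject/click-style layout (module and names hypothetical): the crucial part is that nothing spawns processes at import time, because Windows' spawn start method re-imports the module in every child.

        # mypackage/cli.py -- entry point declared in pyproject.toml as
        # my-tool = "mypackage.cli:main" (names hypothetical)
        import multiprocessing as mp

        def work(n: int) -> int:
            return n * n

        def main() -> None:
            # Safe: runs only when the entry point is invoked, never
            # when a spawned child re-imports this module.
            with mp.Pool(processes=4) as pool:
                print(pool.map(work, range(10)))

        if __name__ == "__main__":
            mp.freeze_support()  # needed for frozen exes; harmless otherwise
            main()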

• tomtom1337 5 days ago

    I would have liked to see the performance of the async version; it seems like a surprising omission.

    I’m also confused about the perf2 performance. For the threads example it starts around 70_000 reqs/sec, while the processes example runs at 3_500 reqs/sec. That’s a 20 times difference that isn’t mentioned in the text.

• captaindiego 4 days ago

    Any advice for debugging asyncio? I've tried it a few times, but every time it got a bit more complicated it felt very hard to figure out what was going wrong.
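
    One built-in starting point is asyncio's debug mode, which logs never-awaited coroutines, unretrieved exceptions, and callbacks that block the loop for too long (a minimal sketch):

      import asyncio
      import logging
      import time

      logging.basicConfig(level=logging.DEBUG)

      async def misbehaving() -> None:
          time.sleep(0.2)  # blocking call inside a coroutine

      async def main() -> None:
          await misbehaving()

      # debug=True (or PYTHONASYNCIODEBUG=1) makes the loop log steps
      # longer than loop.slow_callback_duration (default 0.1s).
      asyncio.run(main(), debug=True)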

• timonoko 5 days ago

    Grok did a better job of explaining the different solutions in a MicroPython environment. Summary:

    * Task parallelism and multi-threading are good for computational tasks spread across the ESP32's dual cores.

    * Asynchronous programming shines in scenarios where I/O operations are predominant.

    * Hardware parallelism via RMT can offload tasks from the CPU, enhancing overall efficiency for specific types of applications.
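
    For the async case on such boards, MicroPython ships an asyncio subset, so the single-threaded I/O pattern reads much like CPython (a sketch assuming a MicroPython build with asyncio; the hardware side is omitted):

      import asyncio  # MicroPython's asyncio subset (formerly uasyncio)

      async def blink(interval_ms):
          while True:
              print("blink")        # toggle a Pin here on real hardware
              await asyncio.sleep_ms(interval_ms)  # MicroPython-only helper

      async def poll_sensor():
          while True:
              print("read sensor")  # I/O-bound work interleaves with blink
              await asyncio.sleep_ms(500)

      async def main():
          await asyncio.gather(blink(250), poll_sensor())

      asyncio.run(main())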

    • griomnib 4 days ago

      Real question: why use Grok over literally any other LLM? I’ve never heard of them being SOTA.

      • timonoko 4 days ago

        There seem to be fewer artificial restrictions. You can ask, for example, "who is Bryan Lunduke".

        This is especially important in the non-English world, as forbidden words have different significance in other cultures.

• griomnib 4 days ago

    The real fun is when you have multiple processes also spawning threads.

• c-fe 5 days ago

    Not really anything new in there. I've been dealing with Python concurrency a lot and I don't find it great compared to other languages (e.g. Kotlin).

    One thing I am struggling with right now is how to handle a function that is both I/O-intensive and CPU-bound. To give more context, I am processing data which on paper is easy to parallelise. Say for 1000 lines of data, I have to execute my function f for each line, in any order. However, f uses the CPU a lot, but also does up to 4 network requests.

    My current approach is to divide 1000/n_cores, then launch n_cores processes and on each of them run the function f asynchronously on all inputs of that process, with async handling the switching on I/O. I wonder if my approach could be improved.
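
    For reference, a stripped-down sketch of that setup (f, its I/O, and the line count are placeholders): each worker process drives its own event loop over its chunk, so I/O waits overlap within a process while CPU work spreads across cores.

      import asyncio
      from concurrent.futures import ProcessPoolExecutor
      from os import cpu_count

      async def f(line: str) -> str:
          await asyncio.sleep(0.01)  # placeholder for the network requests
          return line.upper()        # placeholder for the CPU-heavy part

      def process_chunk(chunk: list[str]) -> list[str]:
          # Each process runs its own event loop over its chunk.
          async def run() -> list[str]:
              return await asyncio.gather(*(f(line) for line in chunk))
          return asyncio.run(run())

      if __name__ == "__main__":
          lines = [f"line-{i}" for i in range(1000)]
          n = cpu_count() or 4
          chunks = [lines[i::n] for i in range(n)]  # one chunk per core
          with ProcessPoolExecutor(max_workers=n) as pool:
              results = [r for part in pool.map(process_chunk, chunks) for r in part]
          print(len(results))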

    • VagabundoP 5 days ago

      Interested in whether you have tried 3.13 free-threading. Your use case might be worth a test there if moving from a process to a threading model isn't too much work.

      Where does your implementation bottleneck?

      Python concurrency does suffer from being relatively new and being bolted onto a decades-old language. I'd expect the state of the art in Python to be much cleaner once no-GIL has been hammered on for a few release cycles.
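
      If you do test it, a quick way to confirm what you're actually running on (sys._is_gil_enabled() is new in 3.13):

        import sys
        import sysconfig

        # Free-threaded builds (e.g. python3.13t) are compiled with
        # Py_GIL_DISABLED; the GIL can still be re-enabled at runtime
        # (PYTHON_GIL=1), so check both.
        print("free-threaded build:", bool(sysconfig.get_config_var("Py_GIL_DISABLED")))
        print("GIL currently enabled:", sys._is_gil_enabled())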

      As always, I suggest the Core.py podcast, as it has a bunch of background details [1]. There are no-GIL updates throughout the series.

      [1] https://podcasts.apple.com/us/podcast/core-py/id1712665877

      • c-fe 4 days ago

        The no-GIL threading indeed looks very promising, thanks for mentioning it. Unfortunately I think it's just a bit too new today to use in production - at least that's the feeling I get reading the docs.

        • VagabundoP 3 days ago

          Oh definitely.

          It will need a cycle or two to mature.

    • numba888 4 days ago

      > I wonder if my approach could be improved.

      Yes. When you use N batches matching the number of cores, the total time is defined by the slowest batch. At the end it will be just one job running. If you make batches smaller, like 1000/n_cores/k, then you may get better CPU utilization and start-to-end total time. Making k too big will add overhead. Assuming n_cores==10, then k==5 may be a good compromise. Depends on start/stop time per job.
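
      Concretely, with multiprocessing.Pool this is just a smaller chunksize (numbers illustrative): the pool hands the small batches to whichever worker frees up first, so the run doesn't end on one giant straggler.

        from multiprocessing import Pool

        def f(line):
            return len(line)  # placeholder for the real per-line work

        if __name__ == "__main__":
            lines = ["x" * i for i in range(1000)]
            n_cores, k = 10, 5
            # 1000 items / (10 cores * 5) -> batches of 20 instead of 100
            chunk = max(1, len(lines) // (n_cores * k))
            with Pool(processes=n_cores) as pool:
                results = pool.map(f, lines, chunksize=chunk)
            print(sum(results))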

      • c-fe 4 days ago

        The numba in your name is convincing me to give this a try! Good idea - I see the point. If I understand you correctly, you're saying that with n_cores==10, doing 50 batches that a pool schedules on any free core will reduce the chance of one core taking much longer. I will try this out.