JVM
-
TL;DR: You’re building an AI-powered app that needs to send lots of prompts to OpenAI. Instead of sending them one by one, you want to do it in bulk — efficiently and safely. This is how you can use Spring AI with Java Virtual Threads to process hundreds of prompts in parallel. When calling LLM APIs like…
-
14 min read
TL;DR: You’re building a semantic search app using Spring Boot and Redis. Instead of matching exact words, semantic search finds meaning using Vector Similarity Search (VSS). It works by turning movie synopses into vectors with embedding models, storing them in Redis (as a vector database), and finding the closest matches to user queries. Video: What is semantic search? A traditional searching system works by matching the words a user types…
-
16 min read
This content is also available on YouTube. Check it out! The Sliding Window Counter offers a more efficient way to handle rate limiting compared to the Sliding Window Log. While the Sliding Window Log keeps an exact log of timestamps for each request, allowing precise tracking over a rolling time period, this precision comes at the cost of higher…
-
14 min read
This article is also available on YouTube. Check it out! The Sliding Window Log is a more precise way to handle rate limiting. Instead of splitting time into fixed intervals like the Fixed Window Counter, it keeps a log of timestamps for each request. This allows it to track requests over a rolling time…
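As a rough illustration of the timestamp-log idea, here is a minimal in-memory sketch. It is not the article's implementation: the class name `SlidingWindowLog`, the `allow` method, and the choice to pass the current time in as a parameter are all assumptions made for this example, and a real deployment would likely keep the log in a shared store rather than in process memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical in-memory sketch of a sliding window log (names are illustrative).
final class SlidingWindowLog {
    private final int limit;          // max requests allowed inside the rolling window
    private final long windowMillis;  // length of the rolling window
    private final Deque<Long> timestamps = new ArrayDeque<>(); // accepted request times, oldest first

    SlidingWindowLog(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns true if a request arriving at nowMillis is allowed.
    synchronized boolean allow(long nowMillis) {
        // Drop timestamps that have fallen out of the rolling window.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(nowMillis); // record the accepted request
            return true;
        }
        return false; // window is full, reject
    }
}
```

Passing the clock value in explicitly keeps the sketch deterministic; production code would typically read `System.currentTimeMillis()` inside `allow`.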
-
14 min read
This article is also available on YouTube! The Token Bucket algorithm is a flexible and efficient rate-limiting mechanism. It works by filling a bucket with tokens at a fixed rate (e.g., one token per second). Each request consumes a token, and if no tokens are available, the request is rejected. The bucket has a maximum…
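The refill-and-consume behavior described here can be sketched in a few lines of plain Java. This is an in-memory illustration only, not the article's code: `TokenBucket`, `tryConsume`, and the explicit `nowNanos` parameter are hypothetical names chosen for the example, and the sketch refills lazily on each call instead of running a background timer.

```java
// Hypothetical in-memory token bucket sketch (names are illustrative).
final class TokenBucket {
    private final long capacity;          // maximum number of tokens the bucket can hold
    private final double refillPerSecond; // tokens added per second
    private double tokens;                // tokens currently available
    private long lastRefillNanos;         // time of the most recent refill

    TokenBucket(long capacity, double refillPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;           // assumption: the bucket starts full
        this.lastRefillNanos = nowNanos;
    }

    // Returns true if a token was available and consumed at nowNanos.
    synchronized boolean tryConsume(long nowNanos) {
        // Lazily add the tokens earned since the last call, capped at capacity.
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefillNanos = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0; // each request costs one token
            return true;
        }
        return false;      // no tokens left, reject the request
    }
}
```

In real use the caller would pass `System.nanoTime()`; taking the time as an argument just makes the behavior easy to check by hand.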
-
14 min read
This article is also available on YouTube! The Fixed Window Counter is the simplest and most straightforward rate-limiting algorithm. It divides time into fixed intervals (e.g., seconds, minutes, or hours) and counts the number of requests within each interval. If the count exceeds a predefined threshold, the requests are rejected until the next interval begins. Looking for…
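To make the interval-and-count idea concrete, here is a minimal in-memory sketch. It is an assumption-laden illustration rather than the article's implementation: `FixedWindowCounter` and `allow` are hypothetical names, windows are derived by integer-dividing a millisecond timestamp, and the state lives in a single JVM instead of a shared store.

```java
// Hypothetical in-memory fixed window counter sketch (names are illustrative).
final class FixedWindowCounter {
    private final int limit;         // max requests allowed per window
    private final long windowMillis; // length of each fixed window
    private long currentWindow = -1; // index of the window being counted
    private int count;               // requests accepted in the current window

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns true if a request arriving at nowMillis is allowed.
    synchronized boolean allow(long nowMillis) {
        long window = nowMillis / windowMillis; // which fixed interval this request falls in
        if (window != currentWindow) {
            currentWindow = window;             // a new interval began, reset the counter
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false; // threshold reached, reject until the next interval
    }
}
```

With a limit of 2 and a 1000 ms window, calls at 0 and 500 are accepted, a call at 999 is rejected, and a call at 1000 is accepted again because a new window has started.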
-
4 min read
It all started when I attended Brian Sam-Bodden’s presentation at Spring IO 2022, which sparked my curiosity about how powerful Redis could be as a primary database. The main reason was speed: a write operation takes less than a millisecond because Redis stores its data in memory. I needed to…