logo
Vinicius.Dev

My Blog

Some texts about stuff that I like

Why is DynamoDB So Blazing Fast? A Deep Dive into Hash Tables and NoSQL Performance

DynamoDB
Hash Tables
NoSQL
AWS

Did already ask yourself why GET methods in DynamoDb are so fast?.

Rust for Blazing-Fast LLM Applications

Rust
GenerativeAI
Concurrency
LLMs

Python's great for prototyping and has amazing libraries like Hugging Face's Transformers, but when performance matters, Rust shines.

Python ThreadPoolExecutor and Concurrency: The Complete Guide

Parallel Programming
I/O Bound
CPU Bound

ThreadPoolExecutor is part of Python's `concurrent.futures` module, which was introduced in Python 3.2. It provides a high-level interface for asynchronously executing callables.

Uncovering The Needle

AI Engineering
LLM Accuracy
Needle In a Haystack
Machine Learning
Generative AI
Benchmarks

In the LLM world, 'needle in a haystack' refers to the challenge of finding specific, accurate information within the vast amount of data these models have been trained on.

How to improve your RAG performance?

RAG Performance
Accuracy
Prompt Engineering
Benchmarks

When building a RAG application, it's very common to face low accuracy at the beginning. To overcome this issue, techniques such as prompt engineering, filtered search, score extractor and more are mandatory to reach performances above 80%.

Context Caching in LLMs

Context Caching
Gemini
Tokens
Prompt

Cache input tokens and consult them for subsequent requests.

What is Parallel Programming

Parallel Programming
C#
Cuda
Multi-core
Computing Architecture
Processors

Parallel programming is a type of computing architecture where multiple processes run simultaneously. This approach can significantly speed up computations by leveraging multi-core processors.