Vinicius.Dev

My Blog

Some texts about stuff that I like

Context Caching in LLMs

Context Caching
Gemini
Tokens
Prompt

Cache input tokens and reuse them in subsequent requests.

How to Model a NoSQL Database in a Master Blaster Way

NoSQL
DynamoDB
Non-Schema-Based
Key-Value Pairs

The senior way to model, query, and understand non-relational databases.

What is Parallel Programming

Parallel Programming
C#
CUDA
Multi-core
Computing Architecture
Processors

Parallel programming is an approach to computing in which multiple processes run simultaneously. It can significantly speed up computations by leveraging multi-core processors.

How to improve your RAG performance?

RAG Performance
Accuracy
Prompt Engineering
Benchmarks

When building a RAG application, it's very common to face low accuracy at the beginning. To overcome this, techniques such as prompt engineering, filtered search, score extraction, and more are needed to reach accuracy above 80%.