<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Engineering Mojo]]></title><description><![CDATA[Helping engineers to be more productive and innovative.  Written by X - AVP @ Ixigo | Previous - Microsoft R& D, Oracle, Expedia. Programming | Distributed Systems | Databases | Career Growth | Behavioural Growth | System Design | Gen AI and a lot more.]]></description><link>https://anujyadav.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png</url><title>Engineering Mojo</title><link>https://anujyadav.substack.com</link></image><generator>Substack</generator><lastBuildDate>Fri, 22 May 2026 08:17:52 GMT</lastBuildDate><atom:link href="https://anujyadav.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Anuj Yadav]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[anujyadav@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[anujyadav@substack.com]]></itunes:email><itunes:name><![CDATA[Anuj Yadav]]></itunes:name></itunes:owner><itunes:author><![CDATA[Anuj Yadav]]></itunes:author><googleplay:owner><![CDATA[anujyadav@substack.com]]></googleplay:owner><googleplay:email><![CDATA[anujyadav@substack.com]]></googleplay:email><googleplay:author><![CDATA[Anuj Yadav]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Quick note on Pydantic]]></title><description><![CDATA[Pydantic is a Python library for data validation and settings management using Python type 
annotations.]]></description><link>https://anujyadav.substack.com/p/quick-note-on-pydantic</link><guid isPermaLink="false">https://anujyadav.substack.com/p/quick-note-on-pydantic</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Mon, 26 May 2025 07:48:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Pydantic</strong> is a Python library for <strong>data validation and settings management</strong> using Python <strong>type annotations</strong>.</p><p>It guarantees the types and shapes of data at runtime, making your code more robust and self-documenting.</p><p>Use cases:</p><ul><li><p>API request/response validation (e.g., FastAPI)</p></li><li><p>Configuration management</p></li><li><p>User input parsing</p></li><li><p>Data cleaning from external sources (JSON, YAML, etc.)</p></li></ul><pre><code>pip install pydantic</code></pre><h2>Basic Model Usage</h2><p>Create a <code>BaseModel</code> subclass</p><pre><code>from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str
    is_active: bool = True  # default value</code></pre><p>Create an instance</p><pre><code># Create instance 
user = User(id=1, name="Alice", email="alice@example.com")
print(user)</code></pre><p>Pydantic will automatically:</p><ul><li><p>Convert types (<code>"1"</code> to <code>1</code>)</p></li><li><p>Validate required fields</p></li><li><p>Use default values where applicable</p></li></ul><p>Example</p><pre><code>user = User(id="42", name="Bob", email="bob@example.com")
print(user.id)  # 42 &#8212; automatically cast from string to int</code></pre><h2>Field Types and Constraints</h2><p>You can add <strong>field constraints</strong> with <code>Field()</code>:</p><pre><code>from pydantic import Field

class Product(BaseModel):
    name: str
    price: float = Field(..., gt=0)
    description: str = Field(default="N/A", max_length=100)

product = Product(name="Laptop", price=-999.99)</code></pre><p>Because <code>price</code> is not greater than 0, Pydantic raises a <code>ValidationError</code>:</p><pre><code>validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for Product
price
  Input should be greater than 0 [type=greater_than, input_value=-999.99, input_type=float]
    For further information visit https://errors.pydantic.dev/2.11/v/greater_than</code></pre><h3>Common Constraints</h3><ul><li><p><code>gt</code>, <code>ge</code> - Greater than, greater than or equal</p></li><li><p><code>lt</code>, <code>le</code> - Less than, less than or equal</p></li><li><p><code>min_length</code>, <code>max_length</code> - For strings and lists (these replace <code>min_items</code> / <code>max_items</code> from Pydantic v1)</p></li><li><p><code>pattern</code> - Match a regular expression (called <code>regex</code> in Pydantic v1)</p></li><li><p><code>default_factory</code> - Create dynamic defaults</p></li></ul><h2>Nested Models</h2><pre><code>class Address(BaseModel):
    city: str
    country: str

class Employee(BaseModel):
    name: str
    address: Address

emp = Employee(name="Jane", address={"city": "Berlin", "country": "Germany"})
print(emp.address.city)  # "Berlin"</code></pre><h2>Aliases and Field Mapping</h2><p>Sometimes JSON keys don't match your Python naming:</p><pre><code>class Book(BaseModel):
    title: str = Field(..., alias='book_title')

book = Book(book_title="Deep Work")
print(book.title)</code></pre><h2>Custom Validators</h2><pre><code>from pydantic import field_validator

class CustomUser(BaseModel):
    name: str
    email: str

    # Pydantic v2 style; the v1 @validator decorator is deprecated
    @field_validator("email")
    @classmethod
    def validate_email(cls, v):
        if "@" not in v:
            raise ValueError("Invalid email")
        return v

custom_user = CustomUser(name="Alice", email="alice#example.com")</code></pre><p>Pydantic will raise an error:</p><pre><code>validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for CustomUser
email
  Value error, Invalid email [type=value_error, input_value='alice#example.com', input_type=str]</code></pre><h2>Parsing &amp; Serialization</h2><p>Parsing from a dict (for a raw JSON string, use <code>model_validate_json</code>)</p><pre><code>data = {"id": "100", "name": "Alice", "email": "alice@example.com"}
user = User.model_validate(data)
print(user)</code></pre><p>Serialization to a dict or a JSON string</p><pre><code>print(user.model_dump())         # as dict
print(user.model_dump_json())    # as JSON string</code></pre><div class="pullquote"><p>This was a quick note on common Pydantic use cases. Happy coding!!</p></div>]]></content:encoded></item><item><title><![CDATA[Type Hinting & Type Checking with Mypy and Pyright in Python]]></title><description><![CDATA[Python is a dynamically typed language, meaning you don't explicitly declare variable types.]]></description><link>https://anujyadav.substack.com/p/type-hinting-and-type-checking-with</link><guid isPermaLink="false">https://anujyadav.substack.com/p/type-hinting-and-type-checking-with</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Fri, 23 May 2025 10:29:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Python is a dynamically typed language, meaning you don't explicitly declare variable types. While this offers flexibility, it can lead to runtime errors, especially in larger codebases or when working in teams. <strong>Type hinting</strong> and <strong>static type checking</strong> are powerful tools that bring some of the benefits of static typing to Python, helping you write more robust, maintainable, and understandable code.</p><p>This tutorial will guide you through:</p><ol><li><p><strong>What Type Hinting is and Why it's Important.</strong></p></li><li><p><strong>How to use </strong><code>mypy</code><strong> for static type checking.</strong></p></li><li><p><strong>How to use </strong><code>pyright</code><strong> for static type checking.</strong></p></li></ol><h3>&#128073; What is Type Hinting and Why Use It?</h3><p><strong>Type Hinting</strong> involves adding annotations to your code to indicate the expected types of variables, function arguments, and return values. 
These hints are <em>not</em> enforced by the Python interpreter at runtime (Python remains dynamically typed), but they are crucial for static analysis tools.</p><p><strong>Why is it important?</strong></p><ul><li><p><strong>Early Error Detection:</strong> Type checkers (like <code>mypy</code> and <code>pyright</code>) can analyze your code <em>before</em> it runs, catching potential type-related bugs that would otherwise only appear at runtime.</p></li><li><p><strong>Improved Readability and Maintainability:</strong> Type hints act as documentation, making it clearer what type of data a function expects and returns. This greatly helps other developers (and your future self!) understand and work with the code.</p></li><li><p><strong>Better IDE Support:</strong> Modern Integrated Development Environments (IDEs) and code editors leverage type hints for features like autocompletion, refactoring, and inline error checking.</p></li><li><p><strong>Refactoring Confidence:</strong> When refactoring code, type checkers can help ensure that changes don't introduce unintended type mismatches.</p></li></ul><p><strong>Basic Type Hinting Syntax:</strong></p><p>Python's <code>typing</code> module provides many tools for type hinting.</p><pre><code>from typing import List, Tuple, Dict, Optional, Union, Any

# 1. Basic types
def greet(name: str) -&gt; str:
    return f"Hello, {name}!"

# 2. List of a specific type
def sum_numbers(numbers: List[int]) -&gt; int:
    return sum(numbers)

# 3. Tuple with specific types at each position
def get_coordinates() -&gt; Tuple[float, float]:
    return (10.5, 20.3)

# 4. Dictionary with specific key and value types
def get_user_info(user_id: int) -&gt; Dict[str, str]:
    users = {1: {'name': 'Alice', 'email': 'alice@example.com'}}
    return users.get(user_id, {})

# 5. Optional type (can be a type or None)
def find_item(items: List[str], item: str) -&gt; Optional[str]:
    if item in items:
        return item
    return None

# 6. Union type (can be one of several types)
def process_input(value: Union[str, int]) -&gt; str:
    if isinstance(value, int):
        return str(value * 2)
    return value.upper()

# 7. Any type (use sparingly, means "any type")
def log_data(data: Any) -&gt; None:
    print(f"Logging: {data}")

# Example usage (no type checking here, just function calls)
print(greet("World"))
print(sum_numbers([1, 2, 3]))
print(get_coordinates())
print(get_user_info(1))
print(find_item(['apple', 'banana'], 'banana'))
print(process_input("hello"))
print(process_input(123))
log_data({'key': 'value'})</code></pre><p></p><h3>&#128073; Static Type Checking with <code>mypy</code></h3><p><code>mypy</code> is a popular static type checker for Python. It reads your code, analyzes the type hints, and reports any inconsistencies or potential type errors.</p><p><strong>Installation:</strong></p><pre><code>pip install mypy</code></pre><p><strong>Basic Usage:</strong></p><p>Let's create a file <code>my_app.py</code> with some code, including a deliberate type error:</p><pre><code># my_app.py
from typing import List

def calculate_average(numbers: List[int]) -&gt; float:
    """Calculates the average of a list of integers."""
    if not numbers:
        return 0.0
    return sum(numbers) / len(numbers)

def process_data(data_list: List[str]) -&gt; None:
    """Processes a list of strings."""
    for item in data_list:
        print(f"Processing: {item.upper()}")

# Correct usage
avg = calculate_average([10, 20, 30])
print(f"Average: {avg}")

process_data(["hello", "world"])

# Incorrect usage (type error)
# mypy should catch this!
process_data([1, 2, 3])</code></pre><p>Now, run <code>mypy</code> from your terminal in the same directory:</p><pre><code>mypy my_app.py</code></pre><p>Expected <code>mypy</code> Output:</p><pre><code>my_app.py:23: error: List item 0 has incompatible type "int"; expected "str"  [list-item]
my_app.py:23: error: List item 1 has incompatible type "int"; expected "str"  [list-item]
my_app.py:23: error: List item 2 has incompatible type "int"; expected "str"  [list-item]
Found 3 errors in 1 file (checked 1 source file)</code></pre><p><strong>Common </strong><code>mypy</code><strong> Flags:</strong></p><ul><li><p><code>--strict</code>: Enables all of mypy's optional strictness checks, including errors for untyped definitions. Highly recommended for new projects.</p></li><li><p><code>--ignore-missing-imports</code>: Silences errors about imports that <code>mypy</code> cannot find type information for. Useful when libraries don't ship type stubs.</p></li><li><p><code>--disallow-untyped-defs</code>: Reports an error whenever a function with no type annotations, or with incomplete annotations, is encountered.</p></li><li><p><code>--warn-unused-ignores</code>: Warns if a <code># type: ignore</code> comment is not necessary.</p></li></ul><p><strong>Configuration (</strong><code>mypy.ini</code><strong> or </strong><code>pyproject.toml</code><strong>):</strong></p><pre><code># mypy.ini
[mypy]
python_version = 3.9
warn_unused_configs = True
warn_redundant_casts = True
warn_unused_ignores = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
no_implicit_optional = True
strict_optional = True
check_untyped_defs = True
show_error_codes = True
pretty = True

# Exclude specific files or directories
exclude = venv/|docs/|build/

# Per-module options (e.g., for a specific library you use)
[mypy-my_app.*]
ignore_missing_imports = True</code></pre><p>Then run <code>mypy</code> against the project root instead of a single file: <code>mypy .</code> (it will find the config).</p><h3>&#128073; Static Type Checking with <code>pyright</code></h3><p><code>pyright</code> is another powerful static type checker developed by Microsoft. It powers VS Code's Pylance extension and is also available as a standalone CLI tool. It's known for its speed and comprehensive type analysis.</p><p><strong>Installation:</strong></p><pre><code>pip install pyright # As a Python package</code></pre><p><strong>Basic Usage:</strong></p><p>Using the same <code>my_app.py</code> file from before:</p><pre><code>pyright my_app.py</code></pre><p>Expected <code>pyright</code> Output:</p><pre><code>error: Argument of type "list[int]" cannot be assigned to parameter "data_list" of type "List[str]" in function "process_data"
    "Literal[1]" is not assignable to "str"
    "Literal[2]" is not assignable to "str"
    "Literal[3]" is not assignable to "str" (reportArgumentType)
1 error, 0 warnings, 0 informations</code></pre><p><strong>Configuration (</strong><code>pyrightconfig.json</code><strong>):</strong></p><p><code>pyright</code> uses a <code>pyrightconfig.json</code> file for configuration. Create this file in your project root:</p><pre><code>{
    "include": ["."], // Check all files in the current directory
    "exclude": ["**/__pycache__", "venv", "docs", "build"],
    "reportMissingImports": true,
    "reportMissingTypeStubs": false, // Set to true if you want to be strict about stubs
    "reportUnusedImport": true,
    "reportUnusedVariable": true,
    "pythonVersion": "3.9",
    "typeCheckingMode": "strict" // "off", "basic", "strict"
}</code></pre><p>Then, just run <code>pyright</code> without specifying the file: <code>pyright</code> (it will find the config).</p><div class="pullquote"><p>Happy Typing</p></div>]]></content:encoded></item><item><title><![CDATA[Data Classes vs NamedTuple vs TypedDict in Python]]></title><description><![CDATA[Let's break down dataclasses, NamedTuple, and TypedDict in Python.]]></description><link>https://anujyadav.substack.com/p/data-classes-vs-namedtuple-vs-typeddict</link><guid isPermaLink="false">https://anujyadav.substack.com/p/data-classes-vs-namedtuple-vs-typeddict</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Fri, 23 May 2025 01:46:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Let's break down <code>dataclasses</code>, <code>NamedTuple</code>, and <code>TypedDict</code> in Python. They all serve similar purposes &#8211; creating structured data &#8211; but they differ in their mutability, type checking behavior, and overhead.</p><h2>&#128073; collections.namedtuple</h2><p><code>namedtuple</code> is part of Python's <code>collections</code> module. It's primarily used to create tuple-like objects that have named fields for each position, making the code more readable and self-documenting than plain tuples.</p><p><strong>Key Characteristics:</strong></p><ul><li><p><strong>Immutable:</strong> Once a <code>namedtuple</code> instance is created, its values cannot be changed.</p></li><li><p><strong>Tuple Subclass:</strong> <code>namedtuple</code> objects are subclasses of <code>tuple</code>. 
This means they inherit all tuple behaviors (e.g., immutability, support for unpacking, iteration).</p></li><li><p><strong>Lightweight:</strong> Very low memory overhead, similar to regular tuples.</p></li><li><p><strong>No Runtime Type Checking:</strong> <code>collections.namedtuple</code> fields carry no type annotations. <code>typing.NamedTuple</code> lets you add them, but they are for static analysis only; Python doesn't enforce them at runtime.</p></li><li><p><strong>Access by Name or Index:</strong> You can access elements using dot notation (e.g., <code>point.x</code>) or by index (e.g., <code>point[0]</code>).</p></li></ul><pre><code># tuple
t = (1, 2, 3)
print(t[0], t[1])

# namedtuple
from collections import namedtuple

# Define a namedtuple
Point = namedtuple('Point', ['x', 'y'])

# Create an instance
p = Point(10, 20)

print(p.x, p.y)  # Access by name
print(p[0], p[1]) # Access by index</code></pre><h3>&#9889;Immutability Sample</h3><blockquote><p>If you try to update a namedtuple member, it would result in an <code>AttributeError</code></p></blockquote><pre><code>p.x = 30 # This would raise an AttributeError (immutable)</code></pre><p></p><h2>&#128073; <code>typing.TypedDict</code></h2><p><code>TypedDict</code> is from the <code>typing</code> module and is used to define a dictionary type with a specific set of keys and their corresponding value types. It's primarily for <strong>static type checking</strong> (e.g., by tools like MyPy) and doesn't enforce types at runtime by default.</p><p><strong>Key Characteristics:</strong></p><ul><li><p><strong>Mutable:</strong> <code>TypedDict</code> instances are regular dictionaries, so their values can be changed after creation. You can also add or remove keys.</p></li><li><p><strong>Dictionary Subclass (Conceptually):</strong> It behaves like a dictionary.</p></li><li><p><strong>Static Type Checking:</strong> Its main purpose is to provide type hints for dictionary structures, allowing type checkers to validate usage.</p></li><li><p><strong>Runtime Behavior:</strong> At runtime, a <code>TypedDict</code> behaves exactly like a normal <code>dict</code>.</p></li><li><p><strong>Required vs. Optional Keys:</strong> You can specify if keys are required or optional.</p></li></ul><p><strong>When to use </strong><code>TypedDict</code><strong>:</strong></p><ul><li><p>When you need a dictionary-like structure with specific keys and types.</p></li><li><p>When the primary goal is to improve code readability and enable static type checking for dictionary schemas.</p></li><li><p>When you need mutability and the flexibility of a dictionary.</p></li><li><p>For defining the expected structure of JSON data or API responses.</p></li></ul><pre><code>from typing import TypedDict

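# NotRequired needs Python 3.11+ (use typing_extensions on older versions)
from typing import NotRequired

# Define a TypedDict
class User(TypedDict):
    name: str
    age: int
    email: NotRequired[str]  # Optional key: may be omitted entirely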

# Create an instance
user1: User = {'name': 'Alice', 'age': 30}
user2: User = {'name': 'Bob', 'age': 25, 'email': 'bob@example.com'}

print(user1)
print(user2)
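
# A hedged sketch of the runtime behavior described above: a TypedDict value
# is a plain dict, and nothing is validated when it is created.
# `Movie` is a hypothetical type used only for this illustration.
from typing import TypedDict

class Movie(TypedDict):
    title: str
    year: int

# A static type checker would flag the string year; Python itself does not
movie: Movie = {"title": "Alien", "year": "1979"}
print(type(movie) is dict)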

user1['age'] = 31 # Mutable
print(user1)</code></pre><h3>&#9889;Static Type Checking</h3><pre><code># user3: User = {'name': 'Charlie'} # MyPy would flag this as missing 'age'</code></pre><p></p><h2>&#128073; <code>dataclasses</code></h2><p><code>dataclasses</code> (introduced in Python 3.7) provide a decorator that automatically generates special methods (like <code>__init__</code>, <code>__repr__</code>, <code>__eq__</code>) for classes primarily used to store data. They offer a good balance between the lightweight nature of <code>namedtuple</code> and the full flexibility of custom classes.</p><p><strong>Key Characteristics:</strong></p><ul><li><p><strong>Mutable by Default:</strong> Data class instances are mutable by default, but you can make them immutable using <code>frozen=True</code>.</p></li><li><p><strong>Regular Class:</strong> They are regular Python classes, allowing you to add custom methods and properties.</p></li><li><p><strong>Type Hinting:</strong> Designed to work seamlessly with type hints, which are used by the decorator to generate methods.</p></li><li><p><strong>Readability:</strong> Reduces boilerplate code for data-holding classes.</p></li><li><p><strong>Runtime Type Checking (Not by default):</strong> Type hints are for static analysis, but you can add runtime checks if needed (e.g., using <code>pydantic</code>).</p></li></ul><p><strong>When to use </strong><code>dataclasses</code><strong>:</strong></p><ul><li><p>When you need a class primarily for storing data.</p></li><li><p>When you want to reduce boilerplate code for <code>__init__</code>, <code>__repr__</code>, <code>__eq__</code>, etc.</p></li><li><p>When you need control over mutability (can be mutable or immutable).</p></li><li><p>When you might want to add custom methods or properties later.</p></li><li><p>For most general-purpose data structures in modern Python.</p></li></ul><pre><code>from dataclasses import dataclass

# Define a dataclass
@dataclass
class Product:
    name: str
    price: float
    quantity: int = 0 # Default value

# Create instances
p1 = Product("Laptop", 1200.0)
p2 = Product("Mouse", 25.50, 100)

print(p1)
print(p2)

p1.quantity = 5 # Mutable by default
print(p1)
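
# A hedged sketch: mutable defaults need field(default_factory=...), and
# asdict() turns an instance into a plain dict.
# `Cart` is a hypothetical class used only for this illustration.
from dataclasses import asdict, dataclass, field

@dataclass
class Cart:
    owner: str
    items: list = field(default_factory=list)  # a fresh list per instance

cart_a = Cart("Alice")
cart_b = Cart("Bob")
cart_a.items.append("Laptop")
print(asdict(cart_a))
print(cart_b.items)  # not shared with cart_a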

# Make it immutable
@dataclass(frozen=True)
class ImmutablePoint:
    x: int
    y: int

ip = ImmutablePoint(5, 10)
# ip.x = 15 # This would raise a dataclasses.FrozenInstanceError</code></pre><h2>&#128165;When to choose which</h2><ul><li><p><code>namedtuple</code><strong>:</strong> For very simple, immutable data records where extreme memory efficiency is paramount, and you don't need custom methods or complex behavior.</p></li><li><p><code>TypedDict</code><strong>:</strong> When you're primarily working with dictionary-like data (e.g., JSON payloads, API responses) and want to add type hints for better static analysis and code clarity, while retaining dictionary flexibility.</p></li><li><p><code>dataclasses</code><strong>:</strong> For most modern Python applications where you need a clean, concise way to define data-holding classes. They offer a good balance of features, readability, and control over mutability, making them a versatile choice for structured data. If you find yourself writing <code>__init__</code> and <code>__repr__</code> methods just to store data, <code>dataclasses</code> are likely what you need.</p></li></ul><div class="pullquote"><p>Happy Data Handling &#128578;</p></div>]]></content:encoded></item><item><title><![CDATA[Python hierarchal Polymorphism & Duck Typing]]></title><description><![CDATA[Python gives us two elegant ways to reuse and organize code:]]></description><link>https://anujyadav.substack.com/p/python-hierarchal-polymorphism-and</link><guid isPermaLink="false">https://anujyadav.substack.com/p/python-hierarchal-polymorphism-and</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Fri, 23 May 2025 00:10:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Python gives us two elegant ways to reuse and organize code:<br>&#128073; Inheritance<br>&#128073; Duck Typing (a very 
"Pythonic" thing!)</p><p></p><blockquote><p>We will learn via code and less by words.</p></blockquote><h2>A simple Pizza class</h2><pre><code>class Pizza:
    # constructor
    def __init__(self, toppings, size):
        # data members
        self.toppings = toppings
        self.size = size
    
    # methods
    def bake(self):
        print(f"Baking a {self.size}-inch pizza with {', '.join(self.toppings)} toppings!")

# Creating a pizza object (using the recipe)
my_pizza = Pizza(["cheese", "tomato", "pepperoni"], 12)
# calling the bake method
my_pizza.bake()</code></pre><h2>Example of Inheritance</h2><blockquote><p>Classic object-oriented hierarchy &#8212; readable, structured, and scalable.</p></blockquote><pre><code>class VeganPizza(Pizza):  # Inherits from Pizza
    # constructor
    def __init__(self, size):
        # super() is used to call the constructor of the parent class
        super().__init__(["tomato", "vegan cheese", "bell peppers"], size)
    
    # overriding the bake method
    def bake(self):
        print(f"Baking a {self.size}-inch VEGAN pizza with {', '.join(self.toppings)} toppings!")

# Creating a VeganPizza object
vegan_order = VeganPizza(10)
vegan_order.bake()</code></pre><h2>Polymorphism &amp; Duck Typing</h2><blockquote><p>&#129414; Duck Typing &#8212; "If it quacks like a duck, it is a duck"</p><p><code>bake_pizza</code> expects the argument to have a <code>bake</code> method. That&#8217;s it!</p><p>No need for inheritance or interfaces.<br>If an object has a <code>bake()</code> method, we just trust it can bake.<br><em>"Don't check types. Just call methods."</em></p></blockquote><pre><code># Inherits from Pizza
class DeepDishPizza(Pizza):
    # overriding the bake method
    def bake(self):
        print(f"Baking a {self.size}-inch DEEP DISH pizza for 40 minutes!")

# Inherits from Pizza
class ThinCrustPizza(Pizza):
    # overriding the bake method
    def bake(self):
        print(f"Baking a {self.size}-inch THIN CRUST pizza for 15 minutes!")

deep_dish = DeepDishPizza(["mozzarella", "tomato"], 14)
thin_crust = ThinCrustPizza(["cheese", "olives"], 12)

# Duck typing
def bake_pizza(pizza):
    pizza.bake()
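
# A hedged sketch of duck typing proper: Calzone (a hypothetical class) does
# not inherit from Pizza, yet any object with a bake() method is accepted.
class Calzone:
    def bake(self):
        return "Baking a CALZONE for 20 minutes!"

def bake_anything(item):
    # Same idea as bake_pizza: no type check, just call bake()
    return item.bake()

calzone_message = bake_anything(Calzone())
print(calzone_message)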
    
# polymorphism
bake_pizza(deep_dish)
bake_pizza(thin_crust)</code></pre><blockquote><h3>&#129300; Why does this matter?</h3><p>&#128313; Inheritance gives structure &#8212; useful when building frameworks.<br>&#128313; Duck typing gives flexibility &#8212; great for rapid, expressive code.</p><p>&#128204; Use inheritance when you want a formal hierarchy.<br>&#128204; Use duck typing when behavior matters more than ancestry.</p></blockquote><p></p><h2>&#128073; Bonus: Abstract Classes</h2><blockquote><p>Abstract classes work as <strong>templates</strong>, ensuring all subclasses follow a standard cooking method.</p></blockquote><p><strong>Example: Defining a Pizza Blueprint</strong></p><pre><code>from abc import ABC, abstractmethod

class AbstractPizza(ABC):  # Defining abstract class
    @abstractmethod
    def bake(self):
        pass  # Every subclass must implement this method

class Margherita(AbstractPizza):
    def bake(self):
        print("Baking Margherita pizza for 25 minutes!")

class Pepperoni(AbstractPizza):
    def bake(self):
        print("Baking Pepperoni pizza for 30 minutes!")
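
# A hedged sketch of what the abstract base enforces: a subclass that skips
# bake() cannot be instantiated.
# `AbstractDish` and `RawDough` are hypothetical names for this illustration.
from abc import ABC, abstractmethod

class AbstractDish(ABC):
    @abstractmethod
    def bake(self):
        ...

class RawDough(AbstractDish):  # forgets to implement bake()
    pass

try:
    RawDough()
    instantiated = True
except TypeError as exc:
    instantiated = False
    print("Cannot instantiate:", exc)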

margherita = Margherita()
margherita.bake()</code></pre><div class="pullquote"><p>Hope you will cook better now &#128522;</p></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anujyadav.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Engineering Mojo! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Functional Python Magic]]></title><description><![CDATA[Here is a quick note on functional Pythonic beauty!]]></description><link>https://anujyadav.substack.com/p/functional-python-magic</link><guid isPermaLink="false">https://anujyadav.substack.com/p/functional-python-magic</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Thu, 22 May 2025 05:03:49 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Here is a quick note on functional Pythonic beauty!</p><pre><code>
# A pure function always returns the same output for the same 
# input and has no side effects
def multiply(x, y):
    return x * y

# A pure function that leaves its input untouched. Returns the same output
# for the same input and does not mutate the argument (immutability)
def pure_double(numbers):
    # work on a copy so the caller's list is never modified
    new_list = numbers.copy()
    # double each element
    for i in range(len(new_list)):
        new_list[i] *= 2
    return new_list
 

# First-Class functions - functions can be assigned to variables or
# passed as arguments
def greet(name):
    return f'Hello {name}'

# First-Class functions. Higher-order function - function that takes
# a function as an argument
def executor(f, name):
    # execute the function
    return f(name)

print('Higher Order Function:',executor(greet, 'John'))

# First-Class functions - Assign function to a variable
message = greet
# use the function
print('Assign function to a variable:',message('John'))

# lambda functions
double = lambda x: x * 2
# use the function
print('Lambda function',double(5))


# Map, Filter, Reduce
from functools import reduce
numbers = [1, 2, 3, 4, 5]

# Map: Create a new list by applying a function to each item in an iterable
doubled = list(map(lambda x: x * 2, numbers))
print('Map function:',doubled)

# Filter: Create a new list by keeping only the items that pass a test
evens = list(filter(lambda x: x % 2 == 0, numbers))
print('Filter function:',evens)

# Reduce: Apply a function to a sequence of elements and return a single value (Aggregate)
# named `total` so the built-in sum() is not shadowed
total = reduce(lambda x, y: x + y, numbers)
print('Reduce function:',total)

# Decorator - function that takes a function as an argument and 
# returns a new function with added functionality
def logit(func):
    def with_logging(*args, **kwargs):
        print(func.__name__ + " was called")
        return func(*args, **kwargs)

    return with_logging
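
# A hedged variation on logit: functools.wraps copies the wrapped function's
# name and docstring onto the wrapper, so introspection still works.
# `logit_wrapped` and `mul` are hypothetical names used only for this sketch.
from functools import wraps

def logit_wrapped(func):
    @wraps(func)
    def with_logging(*args, **kwargs):
        print(func.__name__ + " was called")
        return func(*args, **kwargs)

    return with_logging

@logit_wrapped
def mul(x, y):
    return x * y

print('Wrapped decorator:', mul(2, 3), mul.__name__)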

# Use the decorator
@logit
def add(x, y):
    return x + y

# Use the decorator
@logit
def sub(x, y):
    return x - y

# Use the decorated functions
print('Decorator:',add(2, 3))
print('Decorator:',sub(2, 3))</code></pre><p></p><p>I intentionally left out recursion, since recursive code in Python tends to be slower than the iterative equivalent. Explanation:</p><blockquote><p>Recursion in Python, especially in a functional style, can be slower than iterative approaches due to several factors. Each recursive call adds a new frame to the call stack, leading to function call overhead and increased memory usage. Python also has a recursion depth limit to prevent stack overflow errors, which can restrict the use of deep recursion.</p><p>Memoization (caching results of previous calls) can improve performance. Tail call optimization (converting recursion to iteration) would help as well, but CPython does not implement it. Therefore, for performance-critical applications, iterative solutions are often preferred over recursive ones in Python.</p></blockquote><p></p>]]></content:encoded></item><item><title><![CDATA[Pandas: Your Go-To for Data Analysis in Python, SQL for Python with flexibility]]></title><description><![CDATA[If you're looking to manipulate, analyze, and clean data efficiently in Python, Pandas is an indispensable library.]]></description><link>https://anujyadav.substack.com/p/pandas-your-go-to-for-data-analysis</link><guid isPermaLink="false">https://anujyadav.substack.com/p/pandas-your-go-to-for-data-analysis</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Wed, 21 May 2025 05:53:18 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9289b268-b2d7-421d-887c-b0bdf504e09b_669x614.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you're looking to manipulate, analyze, and clean data efficiently in Python, Pandas is an indispensable library. Built on top of NumPy, it provides powerful and flexible data structures that make working with tabular data intuitive and fast.</p><p>This tutorial will introduce you to the core concepts of Pandas, focusing on its two primary data structures: <code>Series</code> and <code>DataFrame</code>. 
We'll explore common operations with practical examples, including loading data, inspecting it, selecting subsets, filtering, and performing basic aggregations.</p><h2>Why Pandas?</h2><ul><li><p><strong>Powerful Data Structures:</strong> <code>Series</code> for 1D data and <code>DataFrame</code> for 2D tabular data.</p></li><li><p><strong>Easy Data Loading:</strong> Read data from various formats like CSV, Excel, SQL databases, JSON, and more.</p></li><li><p><strong>Flexible Data Manipulation:</strong> Tools for cleaning, transforming, merging, and reshaping data.</p></li><li><p><strong>Robust Analysis:</strong> Functions for statistical analysis, aggregation, and time series operations.</p></li><li><p><strong>Integration:</strong> Works seamlessly with other Python libraries like Matplotlib, Seaborn, and Scikit-learn.</p></li></ul><h2>Installation</h2><p>If you don't have Pandas installed, you can do so using pip:</p><pre><code>pip install pandas numpy matplotlib seaborn</code></pre><h2>Importing Pandas</h2><p>It's standard practice to import Pandas with the alias <code>pd</code>:</p><pre><code>import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns # We'll use this for some built-in datasets</code></pre><h3>1. Pandas Series: The 1D Data Structure</h3><p>A <code>Series</code> is a one-dimensional labeled array capable of holding any data type (integers, strings, floats, Python objects, etc.). It's similar to a column in a spreadsheet or a SQL table, or a NumPy array with an added index.</p><p><strong>Creating a Series:</strong></p><pre><code># From a list
s1 = pd.Series([10, 20, 30, 40, 50])
print("Series from a list:\n", s1)

# From a list with a custom index
s2 = pd.Series([100, 200, 300], index=['a', 'b', 'c'])
print("\nSeries with custom index:\n", s2)

# From a dictionary (keys become the index)
data = {'apple': 10, 'banana': 20, 'cherry': 30}
s3 = pd.Series(data)
print("\nSeries from a dictionary:\n", s3)

# Accessing elements
print("\nElement at index 0 (s1):", s1[0])
print("Element at custom index 'b' (s2):", s2['b'])
print("Element at custom index 'cherry' (s3):", s3['cherry'])
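# Sketch (an addition): .loc and .iloc make the lookup style explicit, which
# avoids ambiguity when the index itself holds integers. s_demo is made-up data.
import pandas as pd

s_demo = pd.Series([100, 200, 300], index=['a', 'b', 'c'])
print("By label (loc['b']):", s_demo.loc['b'])
print("By position (iloc[1]):", s_demo.iloc[1])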

# Slicing a Series
print("\nSlicing s1 (first 3 elements):\n", s1[:3])</code></pre><h3>2. Pandas DataFrame: The 2D Tabular Data Structure</h3><p>A <code>DataFrame</code> is a two-dimensional labeled data structure with columns of potentially different types. It's like a spreadsheet, a SQL table, or a dictionary of <code>Series</code> objects. It's the most commonly used Pandas object.</p><p><strong>Creating a DataFrame:</strong></p><pre><code># From a dictionary of lists
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'London', 'Paris', 'Tokyo']
}
df1 = pd.DataFrame(data)
print("DataFrame from a dictionary of lists:\n", df1)

# From a list of dictionaries
data_list = [
    {'Name': 'Eve', 'Age': 22, 'City': 'Berlin'},
    {'Name': 'Frank', 'Age': 28, 'City': 'Rome'},
    {'Name': 'Grace', 'Age': 33, 'City': 'Madrid'}
]
df2 = pd.DataFrame(data_list)
print("\nDataFrame from a list of dictionaries:\n", df2)</code></pre><h3>3. Loading Data into a DataFrame</h3><p>One of Pandas' most powerful features is its ability to load data from various file formats.</p><p><strong>Loading from a CSV file:</strong></p><p>For this example, let's use the <code>tips</code> dataset, which is a common dataset for demonstrating data analysis. Seaborn provides convenient functions to load such datasets directly.</p><pre><code># Load a built-in dataset from Seaborn
tips = sns.load_dataset('tips')

# Quickly peek into the data with head and tail
print("\nFirst 5 rows of the 'tips' dataset:\n", tips.head())
print("\nLast 3 rows of the 'tips' dataset:\n", tips.tail(3))
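# Sketch (an addition): for your own files, pd.read_csv is the usual entry point.
# The inline CSV text and the name csv_text are made up for illustration;
# io.StringIO stands in for a file path such as 'my_data.csv'.
import io
import pandas as pd

csv_text = "total_bill,tip,day\n16.99,1.01,Sun\n10.34,1.66,Sun\n21.01,3.50,Sat\n"
df_csv = pd.read_csv(io.StringIO(csv_text))
print("\nDataFrame loaded with read_csv:\n", df_csv)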

</code></pre><p></p><h3>4. Inspecting DataFrames</h3><pre><code># Get basic information about the DataFrame
print("\nDataFrame Info:\n")
tips.info()

# Get descriptive statistics for numerical columns
print("\nDescriptive Statistics:\n", tips.describe())

# Get the shape (rows, columns) of the DataFrame
print("\nShape of DataFrame (rows, columns):", tips.shape)

# Get column names
print("\nColumn names:", tips.columns)

# Get data types of each column
print("\nData types:\n", tips.dtypes)

# Get unique values in a column
print("\nUnique days:", tips['day'].unique())

# Get the count of unique values in a column
print("\nValue counts for 'smoker':\n", tips['smoker'].value_counts())</code></pre><h3>5. Selecting Data (Columns and Rows)</h3><p><strong>Selecting Columns:</strong></p><pre><code># Select a single column (returns a Series)
total_bill = tips['total_bill']
print("\n'total_bill' column (Series):\n", total_bill.head())

# Select multiple columns (returns a DataFrame)
subset_df = tips[['total_bill', 'tip', 'day']]
print("\n'total_bill', 'tip', 'day' columns (DataFrame):\n", subset_df.head())</code></pre><p><strong>Selecting Rows using </strong><code>loc</code><strong> and </strong><code>iloc</code><strong>:</strong></p><ul><li><p><code>.loc[]</code>: Label-based indexing (selects by row/column labels).</p></li><li><p><code>.iloc[]</code>: Integer-location based indexing (selects by integer position).</p></li></ul><pre><code># Select row by integer position (iloc)
first_row = tips.iloc[0]
print("\nFirst row (iloc[0]):\n", first_row)

# Select rows by a range of integer positions (iloc)
rows_0_to_2 = tips.iloc[0:3] # rows 0, 1, 2
print("\nRows 0 to 2 (iloc[0:3]):\n", rows_0_to_2)

# Select specific rows and columns by integer position (iloc)
specific_cells = tips.iloc[[0, 2], [1, 3]] # Rows 0 and 2, columns 1 (tip) and 3 (smoker)
print("\nSpecific cells (iloc[[0, 2], [1, 3]]):\n", specific_cells)
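# Sketch (an addition): hard-coded positions are easy to get wrong, so you can
# look a position up by column name with Index.get_loc. The small frame below
# is made-up data that mirrors the first columns of 'tips'.
import pandas as pd

demo = pd.DataFrame({
    'total_bill': [16.99, 10.34, 21.01],
    'tip': [1.01, 1.66, 3.50],
    'sex': ['Female', 'Male', 'Male'],
    'smoker': ['No', 'No', 'Yes']
})
tip_pos = demo.columns.get_loc('tip')
smoker_pos = demo.columns.get_loc('smoker')
print("\nPositions of 'tip' and 'smoker':", tip_pos, smoker_pos)
print(demo.iloc[[0, 2], [tip_pos, smoker_pos]])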

# Select rows by label (loc). 'tips' has a default integer index (0, 1, 2, ...),
# so let's set a more descriptive index for demonstration
tips_indexed = tips.set_index('time')
print("\nTips DataFrame with 'time' as index:\n", tips_indexed.head())

# Now select by label
lunch_data = tips_indexed.loc['Lunch']
print("\nData for 'Lunch' (loc['Lunch']):\n", lunch_data.head())

# Select specific rows and columns by label (loc)
lunch_total_bill_tip = tips_indexed.loc['Lunch', ['total_bill', 'tip']]
print("\nTotal bill and tip for 'Lunch' (loc['Lunch', ['total_bill', 'tip']]):\n", lunch_total_bill_tip.head())</code></pre><h3>6. Filtering Data</h3><p>Filtering allows you to select rows based on one or more conditions.</p><pre><code># Filter for tips greater than $5
high_tip_payers = tips[tips['tip'] &gt; 5]
print("\nCustomers who paid more than $5 tip:\n", high_tip_payers.head())

# Filter for female smokers
female_smokers = tips[(tips['sex'] == 'Female') &amp; (tips['smoker'] == 'Yes')]
print("\nFemale smokers:\n", female_smokers.head())
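# Sketch (an addition): Series.between covers range conditions in one call
# (both ends are inclusive by default). demo_tips is made-up data.
import pandas as pd

demo_tips = pd.DataFrame({'tip': [1.0, 2.5, 4.0, 6.0]})
mid_range = demo_tips[demo_tips['tip'].between(2, 5)]
print("\nTips between $2 and $5:\n", mid_range)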

# Filter for tips on Sunday or Saturday
weekend_tips = tips[tips['day'].isin(['Sun', 'Sat'])]
print("\nTips on Sunday or Saturday:\n", weekend_tips.head())</code></pre><h3>7. Adding and Modifying Columns</h3><pre><code># Add a new column: 'tip_percentage'
tips['tip_percentage'] = (tips['tip'] / tips['total_bill']) * 100
print("\nDataFrame with 'tip_percentage' column:\n", tips.head())

# Modify an existing column (e.g., round 'tip_percentage')
tips['tip_percentage'] = tips['tip_percentage'].round(2)
print("\nDataFrame with rounded 'tip_percentage':\n", tips.head())

# Create a new categorical column based on a condition
tips['generous_tipper'] = np.where(tips['tip_percentage'] &gt; 20, 'Yes', 'No')
print("\nDataFrame with 'generous_tipper' column:\n", tips.head())</code></pre><h3>8. Handling Missing Data</h3><p>Pandas typically uses <code>NaN</code> (Not a Number) to represent missing numeric values; <code>None</code>, <code>NaT</code> (for datetimes), and <code>pd.NA</code> are also treated as missing.</p><pre><code># Create a DataFrame with missing values for demonstration
df_missing = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, 7, 8],
    'C': [9, 10, 11, np.nan]
})
print("\nDataFrame with missing values:\n", df_missing)

# Check for missing values
print("\nMissing values (isnull):\n", df_missing.isnull())
print("\nTotal missing values per column:\n", df_missing.isnull().sum())

# Drop rows with any missing values
df_dropped_rows = df_missing.dropna()
print("\nDataFrame after dropping rows with NaN:\n", df_dropped_rows)
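# Sketch (an addition): dropna can be narrowed with the subset and how parameters.
# df_gaps is a made-up frame used only for this illustration.
import numpy as np
import pandas as pd

df_gaps = pd.DataFrame({
    'A': [1, np.nan, np.nan],
    'B': [5, 6, np.nan]
})
# Drop a row only when 'A' is missing
only_a = df_gaps.dropna(subset=['A'])
# Drop a row only when every value in it is missing
not_all_missing = df_gaps.dropna(how='all')
print("\nRows with 'A' present:\n", only_a)
print("\nRows with at least one value:\n", not_all_missing)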

# Fill missing values with a specific value (e.g., 0)
df_filled_zero = df_missing.fillna(0)
print("\nDataFrame after filling NaN with 0:\n", df_filled_zero)

# Fill missing values with the mean of the column
df_filled_mean = df_missing.fillna(df_missing.mean())
print("\nDataFrame after filling NaN with column mean:\n", df_filled_mean)</code></pre><h3>9. Grouping and Aggregating Data</h3><pre><code># Group by 'day' and calculate the mean of numerical columns
daily_avg = tips.groupby('day').mean(numeric_only=True)
print("\nDaily Averages:\n", daily_avg)

# Group by 'day' and 'sex' and calculate the sum of 'total_bill'
gender_day_total_bill = tips.groupby(['day', 'sex'])['total_bill'].sum()
print("\nTotal bill by day and gender:\n", gender_day_total_bill)
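# Sketch (an addition): the two-level result above is a Series with a MultiIndex;
# unstack() pivots the inner level into columns. demo_sales is made-up data.
import pandas as pd

demo_sales = pd.DataFrame({
    'day': ['Sat', 'Sat', 'Sun', 'Sun'],
    'sex': ['Female', 'Male', 'Female', 'Male'],
    'total_bill': [10.0, 20.0, 30.0, 40.0]
})
by_day_sex = demo_sales.groupby(['day', 'sex'])['total_bill'].sum()
table = by_day_sex.unstack()
print("\nTotal bill as a day x sex table:\n", table)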

# Group by 'smoker' and calculate multiple aggregations
smoker_stats = tips.groupby('smoker').agg(
    avg_tip=('tip', 'mean'),
    max_bill=('total_bill', 'max'),
    count=('total_bill', 'count')
)
print("\nSmoker Statistics:\n", smoker_stats)</code></pre><h3>10. Merging and Joining DataFrames</h3><pre><code># Create two dummy DataFrames
df_customers = pd.DataFrame({
    'customer_id': [1, 2, 3, 4],
    'name': ['Alice', 'Bob', 'Charlie', 'David']
})

df_orders = pd.DataFrame({
    'order_id': [101, 102, 103, 104, 105],
    'customer_id': [1, 3, 2, 1, 5], # Customer 5 does not exist in df_customers
    'amount': [50, 75, 20, 100, 30]
})

print("\nCustomers DataFrame:\n", df_customers)
print("\nOrders DataFrame:\n", df_orders)

# Inner merge (only common customer_ids)
merged_inner = pd.merge(df_customers, df_orders, on='customer_id', how='inner')
print("\nInner Merge (customers with orders):\n", merged_inner)

# Left merge (all customers, even if no orders)
merged_left = pd.merge(df_customers, df_orders, on='customer_id', how='left')
print("\nLeft Merge (all customers, with NaNs for no orders):\n", merged_left)

# Right merge (all orders, even if customer_id not in customers)
merged_right = pd.merge(df_customers, df_orders, on='customer_id', how='right')
print("\nRight Merge (all orders, with NaNs for unknown customers):\n", merged_right)
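# Sketch (an addition): indicator=True adds a '_merge' column that records
# whether each row came from the left frame, the right frame, or both.
# left_demo and right_demo are made-up frames for illustration.
import pandas as pd

left_demo = pd.DataFrame({'customer_id': [1, 2], 'name': ['Alice', 'Bob']})
right_demo = pd.DataFrame({'customer_id': [2, 3], 'amount': [20, 30]})
merged_flagged = pd.merge(left_demo, right_demo, on='customer_id',
                          how='outer', indicator=True)
print("\nOuter merge with indicator:\n", merged_flagged)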

# Outer merge (all records from both, filling NaNs)
merged_outer = pd.merge(df_customers, df_orders, on='customer_id', how='outer')
print("\nOuter Merge (all records from both):\n", merged_outer)</code></pre><div class="pullquote"><p>Pandas is like SQL for Python, with flexibility. Hope you liked it.</p></div>]]></content:encoded></item><item><title><![CDATA[A Beginner's Guide to Beautiful Data Visualization with Seaborn]]></title><description><![CDATA[Data visualization is a crucial aspect of data analysis and communication.]]></description><link>https://anujyadav.substack.com/p/a-beginners-guide-to-beautiful-data</link><guid isPermaLink="false">https://anujyadav.substack.com/p/a-beginners-guide-to-beautiful-data</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Wed, 21 May 2025 03:39:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!d1fM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Data visualization is a crucial aspect of data analysis and communication. It allows us to understand complex datasets, identify patterns, and extract meaningful insights. 
Python, with its rich ecosystem of libraries, offers powerful tools for creating compelling visualizations. Among these, Seaborn stands out as a library specifically designed for statistical data visualization.</p><h2>What is Seaborn?</h2><p>Seaborn is a Python data visualization library built on top of Matplotlib. While Matplotlib provides a solid foundation for creating a wide range of plots, Seaborn offers a higher-level interface with more aesthetic defaults and specialized functions for statistical plots. In essence, Seaborn makes it easier to create beautiful and informative visualizations with less code.</p><h2>Key Features of Seaborn</h2><ul><li><p><strong>Simplified Syntax:</strong> Seaborn provides a concise and intuitive syntax, reducing the amount of code needed to create complex plots.</p></li><li><p><strong>Statistical Focus:</strong> Seaborn is designed to handle statistical data, offering functions for visualizing distributions, relationships between variables, and categorical data.</p></li><li><p><strong>Integration with Pandas:</strong> Seaborn seamlessly integrates with Pandas DataFrames, making it easy to visualize data directly from structured 
datasets.</p></li><li><p><strong>Attractive Defaults:</strong> Seaborn comes with a set of aesthetically pleasing default styles, making your plots look polished and professional.</p></li><li><p><strong>Customization:</strong> While Seaborn provides sensible defaults, it also allows for extensive customization to fine-tune the appearance of your plots.</p></li></ul><h2>Getting Started with Seaborn</h2><p>To use Seaborn, you first need to install it. If you have Python and pip installed, you can simply run the following command in your terminal or command prompt:</p><pre><code>pip install seaborn</code></pre><blockquote><p>I highly recommend using a virtual environment.</p></blockquote><p>Once installed, you can import Seaborn and friends in your Python script or Jupyter Notebook:</p><pre><code>import seaborn as sns
import matplotlib.pyplot as plt</code></pre><h2>Visualizing Data with Seaborn</h2><p>Let's explore some of the fundamental Seaborn functions with examples using the "Iris" dataset, a classic dataset for classification tasks.</p><p><strong>1. Visualizing Distributions</strong></p><ul><li><p><strong>Histograms:</strong> Histograms provide a visual representation of the distribution of a single variable.</p><pre><code># Load the Iris dataset
iris = sns.load_dataset('iris')

# Create a histogram of the 'sepal_length' column
sns.histplot(iris['sepal_length'], kde=True)  
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Frequency')
plt.title('Distribution of Sepal Length')
plt.show()</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d1fM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d1fM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png 424w, https://substackcdn.com/image/fetch/$s_!d1fM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png 848w, https://substackcdn.com/image/fetch/$s_!d1fM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png 1272w, https://substackcdn.com/image/fetch/$s_!d1fM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d1fM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png" width="640" height="480" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:640,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23948,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/163999256?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d1fM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png 424w, https://substackcdn.com/image/fetch/$s_!d1fM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png 848w, https://substackcdn.com/image/fetch/$s_!d1fM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png 1272w, https://substackcdn.com/image/fetch/$s_!d1fM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F356b7a19-16d6-42e3-9ea9-dfb67db5ca4e_640x480.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p>This code creates a histogram showing the frequency of different sepal lengths in the Iris dataset. The kde=True argument adds a kernel density estimate curve, providing a smoother representation of the distribution.</p><p></p></li><li><p><strong>Kernel Density Plots (KDEs):</strong> KDE plots provide a smooth estimate of the probability density function of a continuous variable.</p><pre><code># Create a KDE plot of the 'sepal_width' column
sns.kdeplot(iris['sepal_width'])
plt.xlabel('Sepal Width (cm)')
plt.ylabel('Density')
plt.title('KDE Plot of Sepal Width')
plt.show()</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Amkd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Amkd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png 424w, https://substackcdn.com/image/fetch/$s_!Amkd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png 848w, https://substackcdn.com/image/fetch/$s_!Amkd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png 1272w, https://substackcdn.com/image/fetch/$s_!Amkd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Amkd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png" width="640" height="480" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:640,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25411,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/163999256?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Amkd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png 424w, https://substackcdn.com/image/fetch/$s_!Amkd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png 848w, https://substackcdn.com/image/fetch/$s_!Amkd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png 1272w, https://substackcdn.com/image/fetch/$s_!Amkd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffd39e580-1355-4bde-9b3c-8cb5306ad2d8_640x480.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p></li><li><p><strong>Box Plots:</strong> Box plots display the distribution of data based on quartiles, showing the median, interquartile range, and potential outliers.</p><pre><code># Create a box plot of 'petal_length' by 'species'
sns.boxplot(x='species', y='petal_length', data=iris)
plt.xlabel('Species')
plt.ylabel('Petal Length (cm)')
plt.title('Box Plot of Petal Length by Species')
plt.show()</code></pre><p>This code generates a box plot comparing the distribution of petal lengths across different Iris species.</p></li><li><p><strong>Violin Plots:</strong> Violin plots are similar to box plots but provide a richer representation of the data distribution, showing the probability density of the data at different values.</p><pre><code># Create a violin plot of 'petal_width' by 'species'
sns.violinplot(x='species', y='petal_width', data=iris, inner='quartile')
plt.xlabel('Species')
plt.ylabel('Petal Width (cm)')
plt.title('Violin Plot of Petal Width by Species')
plt.show()</code></pre></li></ul><p><strong>2. Visualizing Relationships Between Variables</strong></p><ul><li><p><strong>Scatter Plots:</strong> Scatter plots are used to visualize the relationship between two quantitative variables.</p><pre><code># Create a scatter plot of 'sepal_length' vs. 'sepal_width'
sns.scatterplot(x='sepal_length', y='sepal_width', hue='species', data=iris)
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.title('Scatter Plot of Sepal Length vs. Sepal Width')
plt.show()</code></pre><p>The <code>hue</code> argument adds a third dimension to the plot by coloring the points based on the 'species' column.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WSPI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WSPI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png 424w, https://substackcdn.com/image/fetch/$s_!WSPI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png 848w, https://substackcdn.com/image/fetch/$s_!WSPI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png 1272w, https://substackcdn.com/image/fetch/$s_!WSPI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WSPI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png" width="640" height="480" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:640,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42515,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/163999256?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WSPI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png 424w, https://substackcdn.com/image/fetch/$s_!WSPI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png 848w, https://substackcdn.com/image/fetch/$s_!WSPI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png 1272w, https://substackcdn.com/image/fetch/$s_!WSPI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7696d3-0b9f-4a68-829b-f8da07a2ee6a_640x480.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p></li><li><p><strong>Regression Plots:</strong> Regression plots visualize the linear relationship between two variables and fit a regression line to the data.</p><pre><code># Create a regression plot of 'sepal_length' vs. 'sepal_width'
sns.regplot(x='sepal_length', y='sepal_width', data=iris)
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.title('Regression Plot of Sepal Length vs. Sepal Width')
plt.show()</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r7y0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r7y0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png 424w, https://substackcdn.com/image/fetch/$s_!r7y0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png 848w, https://substackcdn.com/image/fetch/$s_!r7y0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png 1272w, https://substackcdn.com/image/fetch/$s_!r7y0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r7y0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png" width="640" height="480" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:640,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32896,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/163999256?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r7y0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png 424w, https://substackcdn.com/image/fetch/$s_!r7y0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png 848w, https://substackcdn.com/image/fetch/$s_!r7y0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png 1272w, https://substackcdn.com/image/fetch/$s_!r7y0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2412b082-2f5c-430c-87eb-07f94bb44687_640x480.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p></li><li><p><strong>Pair Plots:</strong> Pair plots visualize the relationships between all pairs of variables in a dataset.</p><pre><code># Create a pair plot of the Iris dataset
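g = sns.pairplot(iris, hue='species')
# pairplot returns a grid of subplots, so set a figure-level title;
# plt.title() would only label the last subplot
g.fig.suptitle('Pair Plot of Iris Dataset', y=1.02)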
plt.show()</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!330f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!330f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png 424w, https://substackcdn.com/image/fetch/$s_!330f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png 848w, https://substackcdn.com/image/fetch/$s_!330f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png 1272w, https://substackcdn.com/image/fetch/$s_!330f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!330f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png" width="965" height="799" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:799,&quot;width&quot;:965,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:207363,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/163999256?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!330f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png 424w, https://substackcdn.com/image/fetch/$s_!330f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png 848w, https://substackcdn.com/image/fetch/$s_!330f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png 1272w, https://substackcdn.com/image/fetch/$s_!330f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F756c5fa6-f4a7-4de6-9b70-738a1b577967_965x799.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Pair plots are a great way to quickly explore the relationships between multiple variables in your data.</p></li></ul><p><strong>3. Visualizing Categorical Data</strong></p><ul><li><p><strong>Count Plots:</strong> Count plots display the frequency of occurrences of different categories in a categorical variable.</p><pre><code># Create a count plot of the 'species' column
sns.countplot(x='species', data=iris)
plt.xlabel('Species')
plt.ylabel('Count')
plt.title('Count Plot of Species')
plt.show()</code></pre></li><li><p><strong>Bar Plots:</strong> Bar plots are used to visualize the relationship between a categorical variable and a quantitative variable.</p><pre><code># Create a bar plot of average 'sepal_length' by 'species'
sns.barplot(x='species', y='sepal_length', data=iris)
plt.xlabel('Species')
plt.ylabel('Average Sepal Length (cm)')
plt.title('Bar Plot of Average Sepal Length by Species')
plt.show()</code></pre></li></ul><h2>Conclusion</h2><p>Seaborn is a powerful and versatile data visualization library that simplifies the process of creating beautiful and informative statistical plots in Python. Whether you're exploring distributions, visualizing relationships between variables, or analyzing categorical data, Seaborn provides a wide range of functions and customization options to meet your needs. By mastering Seaborn, you can effectively communicate your data insights and gain a deeper understanding of your datasets.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anujyadav.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Engineering Mojo! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Gen AI: Shot-Based Prompting]]></title><description><![CDATA[Prompt is a powerful tool to interact with LLMs.]]></description><link>https://anujyadav.substack.com/p/gen-ai-shot-based-prompting</link><guid isPermaLink="false">https://anujyadav.substack.com/p/gen-ai-shot-based-prompting</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Tue, 20 May 2025 04:57:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" 
length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Prompts are a powerful tool for interacting with LLMs. There are many prompting strategies, and each one suits a particular kind of scenario.</p><h2>Shot-Based Prompting</h2><p>We have "zero-shot", "one-shot", and "few-shot" prompting. The prefixes "zero", "one", and "few" refer to the number of input-output examples provided in the prompt to guide the LLM. This approach is also known as In-Context Learning (ICL), where the LLM learns directly from the examples within the prompt itself, without requiring explicit fine-tuning.</p><h3>Zero-Shot Prompting:</h3><p><strong>Concept</strong>: In zero-shot prompting, you give the LLM a direct instruction to perform a task <strong>without providing any examples</strong> of the desired output. The model relies solely on its pre-trained knowledge to understand the task and generate a response.</p><p><strong>Analogy</strong>: Imagine asking someone to translate a sentence into another language without ever showing them an example of translation. 
They have to rely on their general knowledge of languages and translation principles.</p><p><strong>Example</strong>: Given the prompt below, the LLM is expected to use its understanding of English and French to produce the translation "Bonjour, comment allez-vous?"</p><pre><code>Prompt: Translate the following English text to French: "Hello, how are you?"</code></pre><p><strong>Use Cases</strong>: Suitable for simple and well-understood tasks that the LLM has likely encountered during its training, such as basic classification, general queries, and simple translations.</p><p><strong>Advantages</strong>:</p><ul><li><p>Requires no task-specific examples, saving time and effort in prompt creation.</p></li><li><p>Demonstrates the model's inherent capabilities and generalization from pre-training.</p></li></ul><p><strong>Limitations</strong>:</p><ul><li><p>Performance can be unpredictable or inaccurate for complex or nuanced tasks.</p></li><li><p>The model might struggle with specific output formats or styles without guidance.</p></li></ul><h3>One-Shot Prompting:</h3><p><strong>Concept</strong>: One-shot prompting builds on zero-shot prompting by giving the LLM <strong>a single example</strong> of the desired input-output format alongside the task instruction. This single example helps clarify the task and improves the model's understanding of what is expected.</p><p><strong>Analogy</strong>: Now, before asking someone to translate, you show them one example of an English sentence and its French translation.</p><p><strong>Example</strong>: The LLM sees the example "Good morning" -&gt; "Bonjour" and is then asked to translate "How are you?". This single example can guide the model to provide a more accurate and appropriately formatted response.</p><pre><code>Prompt: Translate English to French:
English: Good morning
French: Bonjour
English: How are you?
French:</code></pre><p><strong>Use Cases</strong>: Helpful for tasks that need more specific guidance or when the model struggles with ambiguity in a zero-shot setting, such as basic classification with specific labels or structured information extraction.</p><p><strong>Advantages</strong>:</p><ul><li><p> Provides some context and clarifies the expected output format.</p></li><li><p> Can lead to better performance than zero-shot prompting for slightly more complex tasks.</p></li></ul><p><strong>Limitations</strong>:</p><ul><li><p> A single example might not be sufficient for highly complex or nuanced tasks.</p></li><li><p> The choice of the single example can significantly impact the model's output.</p></li></ul><h3>Few-Shot Prompting:</h3><p><strong>Concept</strong>: Few-shot prompting takes the idea of providing examples further by including <strong>multiple (typically 2-5)</strong> input-output examples in the prompt before the final task instruction. These examples help the model recognize patterns, understand complex tasks, and adhere to specific output formats or styles.</p><p><strong>Analogy</strong>: You now provide several examples of English sentences and their French translations before asking for a new translation.</p><p><strong>Example</strong>: The LLM learns from the provided examples how to associate certain phrases with "Positive" or "Negative" sentiment and can then apply this pattern to the new review.</p><pre><code>Prompt: Classify the sentiment of the following movie reviews as positive or negative.  
Review: "This movie was amazing and I loved every second."
Sentiment: Positive
Review: "The plot was confusing and the acting was terrible."
Sentiment: Negative
Review: "I enjoyed it, but it wasn't anything special."
Sentiment: Positive
Review: "This film is a complete waste of time."
Sentiment:</code></pre><p><strong>Use Cases</strong>: Best suited for complex tasks that require specific output formats or nuanced understanding, or where higher accuracy and consistency are needed, such as generating structured outputs, handling nuanced classifications, or following specific writing styles.</p><p><strong>Advantages</strong>:</p><ul><li><p>Significantly improves performance on complex tasks compared to zero-shot and one-shot prompting.</p></li><li><p>Allows the model to learn from patterns and adhere to specific output requirements.</p></li><li><p>Reduces the need for extensive fine-tuning data.</p></li></ul><p><strong>Limitations</strong>:</p><ul><li><p>Prompts can become long, potentially exceeding the model's context window limits.</p></li><li><p>The quality and diversity of the examples are crucial for performance. Biased or poorly chosen examples can lead to suboptimal results.</p></li><li><p>Can still be less effective than fine-tuning on a large, task-specific dataset for highly specialized tasks.</p></li></ul><div class="pullquote"><p>We will keep learning more on this Gen AI journey. Subscribe to stay updated.</p></div>]]></content:encoded></item><item><title><![CDATA[Understanding PEFT: Parameter-Efficient Fine-Tuning for Large Language Models]]></title><description><![CDATA[In today's rapidly evolving AI landscape, Large Language Models (LLMs) like GPT-4, Claude, and LLaMA have revolutionized natural language processing.]]></description><link>https://anujyadav.substack.com/p/understanding-peft-parameter-efficient</link><guid isPermaLink="false">https://anujyadav.substack.com/p/understanding-peft-parameter-efficient</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Tue, 06 May 2025 10:24:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today's rapidly evolving AI landscape, Large Language Models (LLMs) like GPT-4, Claude, and LLaMA have revolutionized natural language processing. However, adapting these massive models to specific tasks presents significant challenges. This is where Parameter-Efficient Fine-Tuning (PEFT) techniques come in, offering clever ways to customize LLMs without breaking the bank.</p><h2>The Fine-Tuning Challenge</h2><p>Traditional fine-tuning involves updating all parameters of a pre-trained model for a specific task. 
For context, modern LLMs can have billions or even trillions of parameters:</p><ul><li><p>GPT-3: 175 billion parameters</p></li><li><p>LLaMA 2: Up to 70 billion parameters</p></li></ul><p>Fully fine-tuning these massive models requires:</p><ul><li><p>Enormous computational resources</p></li><li><p>Significant GPU memory</p></li><li><p>Substantial energy consumption</p></li><li><p>Storage for multiple full model copies</p></li></ul><p>For most organizations, this approach is prohibitively expensive and environmentally unsustainable.</p><h2>Enter PEFT: The Efficient Alternative</h2><p>PEFT techniques address these limitations by training only a small number of parameters (either a subset of the existing weights or a few newly added ones) while keeping most of the pre-trained model frozen. This approach offers several advantages:</p><ul><li><p><strong>Resource efficiency</strong>: Requires significantly less computing power</p></li><li><p><strong>Storage efficiency</strong>: Each task needs only a small set of extra weights instead of a full model copy</p></li><li><p><strong>Catastrophic forgetting prevention</strong>: The frozen base weights preserve general knowledge</p></li><li><p><strong>Adaptability</strong>: Easier to deploy across different tasks</p></li></ul><h2>Key PEFT Techniques Explained</h2><p>Let's dive into several popular PEFT methods with their mathematical foundations:</p><h3>1. 
LoRA (Low-Rank Adaptation)</h3><p>LoRA focuses on representing parameter updates through low-rank decomposition matrices.</p><p><strong>Mathematical concept</strong>: Instead of updating a weight matrix W directly, LoRA introduces two smaller matrices A and B:</p><p>&#916;W = A&#183;B</p><p>Where:</p><ul><li><p>A &#8712; &#8477;^(d&#215;r)</p></li><li><p>B &#8712; &#8477;^(r&#215;k)</p></li><li><p>r &lt;&lt; min(d,k)</p></li></ul><p>The modified forward pass becomes: y = x&#183;(W + &#916;W) = x&#183;W + x&#183;A&#183;B</p><p><strong>Example</strong>: For a 1000&#215;1000 weight matrix (1M parameters), with rank r=8:</p><ul><li><p>A would be 1000&#215;8 (8K parameters)</p></li><li><p>B would be 8&#215;1000 (8K parameters)</p></li><li><p>Total trainable parameters: 16K (just 1.6% of original)</p></li></ul><p>LoRA works exceptionally well for attention mechanisms in transformer architectures, often approaching full fine-tuning performance with just 0.1-1% of the parameters.</p><h3>2. Prompt Tuning</h3><p>Prompt tuning adds trainable continuous embeddings (soft prompts) to the input while keeping the model frozen.</p><p><strong>Mathematical concept</strong>: For input tokens X, we prepend or append trainable embeddings P:</p><p>X' = [P; X] or X' = [X; P]</p><p>Where:</p><ul><li><p>X &#8712; &#8477;^(n&#215;d) (n tokens with embedding dimension d)</p></li><li><p>P &#8712; &#8477;^(p&#215;d) (p trainable token embeddings)</p></li></ul><p><strong>Example</strong>: For a model with embedding dimension 768, adding 20 trainable prompt tokens means:</p><ul><li><p>P contains 20&#215;768 = 15,360 trainable parameters</p></li><li><p>For a 7B parameter model, this represents just 0.0002% of parameters</p></li></ul><h3>3. 
Prefix Tuning</h3><p>Prefix tuning extends prompt tuning by adding trainable parameters to each layer of the model.</p><p><strong>Mathematical concept</strong>: For a Transformer with L layers, we prepend trainable prefixes Pk,i and Pv,i to the key and value projections of each layer i:</p><p>K'i = [Pk,i; Ki] and V'i = [Pv,i; Vi]</p><p>Where:</p><ul><li><p>Pk,i and Pv,i &#8712; &#8477;^(p&#215;d) are trainable prefixes for layer i</p></li><li><p>Ki and Vi are the original key and value projections</p></li></ul><p><strong>Example</strong>: For a 12-layer transformer with 768-dimensional embeddings and 10 prefix tokens:</p><ul><li><p>Total trainable parameters: 12 layers &#215; 2 (keys and values) &#215; 10 tokens &#215; 768 dimensions = 184,320 parameters</p></li><li><p>Still a tiny fraction of the full model</p></li></ul><h3>4. Adapter Modules</h3><p>Adapters insert small trainable modules between layers of the frozen model.</p><p><strong>Mathematical concept</strong>: For a layer with transformation f, we insert an adapter module g:</p><p>y = f(x) + g(f(x))</p><p>Where g typically follows a bottleneck architecture: g(x) = W2&#183;&#963;(W1&#183;x)</p><p>With:</p><ul><li><p>W1 &#8712; &#8477;^(b&#215;d) (down-projection from d to b dimensions)</p></li><li><p>W2 &#8712; &#8477;^(d&#215;b) (up-projection from b back to d dimensions)</p></li><li><p>b &lt;&lt; d (bottleneck dimension)</p></li><li><p>&#963; is a non-linear activation function</p></li></ul><p><strong>Example</strong>: For a transformer with hidden dimension 1024 and bottleneck dimension 64 (ignoring bias terms):</p><ul><li><p>Parameters per adapter: 64&#215;1024 + 1024&#215;64 = 131,072</p></li><li><p>Adding adapters after each attention and FFN layer in a 12-layer model: 24 adapters &#215; 131,072 &#8776; 3.1M parameters (much less than the full model)</p></li></ul><h3>5. 
BitFit</h3><p>BitFit focuses exclusively on training the bias terms while keeping all other parameters frozen.</p><p><strong>Mathematical concept</strong>: For a transformation with weights W and biases b: y = W&#183;x + b</p><p>BitFit only updates b, keeping W frozen.</p><p><strong>Example</strong>: In a 7B parameter model, bias terms might account for only ~0.1% of parameters (7M), making BitFit extremely parameter-efficient.</p><h2>Mathematical Intuition Behind PEFT</h2><p>The effectiveness of PEFT techniques lies in the concept of low intrinsic dimensionality. Despite having billions of parameters, the actual functional changes needed to adapt a model to a specific task often lie in a much lower-dimensional subspace.</p><p>Consider the parameter space &#8477;^N of an LLM with N parameters. The task-specific adaptations often lie in a subspace &#8477;^M where M &lt;&lt; N. PEFT methods effectively find this lower-dimensional subspace, allowing efficient adaptation.</p><p>This can be formalized through Singular Value Decomposition (SVD) of the parameter update matrix: &#916;W = U&#931;V^T</p><p>Where many singular values in &#931; are close to zero, indicating that &#916;W has a low effective rank.</p><h2>Practical Applications and Examples</h2><h3>Example 1: Medical Domain Adaptation</h3><p><strong>Scenario</strong>: Adapting a general LLM for medical question answering</p><p><strong>PEFT approach</strong>: LoRA with r=16</p><ul><li><p>Freeze all 7B parameters of the base model</p></li><li><p>Train only 3M LoRA parameters (0.04% of full model)</p></li><li><p>Training time: 4 hours on a single GPU vs. 
1 week for full fine-tuning</p></li><li><p>Performance: 96% of full fine-tuning accuracy</p></li></ul><h3>Example 2: Legal Document Analysis</h3><p><strong>Scenario</strong>: Fine-tuning for legal contract analysis</p><p><strong>PEFT approach</strong>: Prefix tuning with 50 prefix tokens</p><ul><li><p>Train only 1.5M parameters</p></li><li><p>Maintain model's general knowledge while specializing in legal terminology</p></li><li><p>Adaptable across multiple jurisdictions with separate small prefix sets</p></li></ul><h3>Example 3: Multilingual Adaptation</h3><p><strong>Scenario</strong>: Adapting an English-centric LLM for low-resource languages</p><p><strong>PEFT approach</strong>: Combination of adapter layers and prompt tuning</p><ul><li><p>Language-specific adapters (2M parameters per language)</p></li><li><p>Shared cross-lingual prompt tokens (20K parameters)</p></li><li><p>Results: Achieves 92% of full fine-tuning performance with only 0.03% of parameters</p></li></ul><h2>Recent Innovations in PEFT</h2><h3>QLoRA (Quantized LoRA)</h3><p>QLoRA combines parameter quantization with LoRA, enabling fine-tuning of even larger models on a single GPU.</p><p><strong>Mathematical concept</strong>: The frozen base model is quantized to 4 bits (QLoRA uses a 4-bit NormalFloat data type), while the LoRA matrices are kept in 16-bit precision:</p><p>y = x&#183;Q(W) + x&#183;A&#183;B</p><p>Where Q(W) is the quantized version of the original weights.</p><p><strong>Example</strong>: Fine-tuning a 65B parameter model on a single 48GB GPU, the scale reported in the QLoRA paper (at 4 bits, a model of this size needs over 30GB for its weights alone, so it does not fit on a 24GB consumer card). The rank and parameter count below are illustrative:</p><ul><li><p>Base model quantized to 4-bit precision</p></li><li><p>LoRA rank r=16</p></li><li><p>20M trainable parameters (0.03% of full model)</p></li></ul><h3>ULoRA (Unified LoRA)</h3><p>ULoRA enables efficient transfer between different tasks by introducing task-specific vectors that modulate LoRA matrices.</p><p><strong>Mathematical concept</strong>: For each task t, the LoRA update becomes:</p><p>&#916;W_t = diag(v_t)&#183;A&#183;B</p><p>Where v_t is a learnable task 
vector that scales the contribution of different LoRA components.</p><h2>Conclusion</h2><p>Parameter-Efficient Fine-Tuning represents a critical development in making LLMs more accessible and practical. These techniques democratize access to state-of-the-art AI by reducing computational requirements while maintaining impressive performance.</p><p>The mathematical elegance of PEFT methods reveals a fundamental insight: adaptation often lies in low-dimensional subspaces of the parameter space. By identifying and focusing on these subspaces, we can efficiently specialize massive models for specific applications.</p><p>As LLMs continue to grow in size and capability, PEFT techniques will become increasingly essential for practical deployment, enabling organizations of all sizes to leverage the power of advanced AI while managing computational resources responsibly.</p>]]></content:encoded></item><item><title><![CDATA[The wonderful world of TF-IDF!]]></title><description><![CDATA[TF-IDF (Term Frequency-Inverse Document Frequency) is a fundamental concept in natural language processing and information retrieval.]]></description><link>https://anujyadav.substack.com/p/the-wonderful-world-of-tf-idf</link><guid isPermaLink="false">https://anujyadav.substack.com/p/the-wonderful-world-of-tf-idf</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Sun, 04 May 2025 20:39:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>TF-IDF (Term Frequency-Inverse Document Frequency) is a fundamental concept in natural language processing and information retrieval. 
It's a technique used to weigh the importance of words in a document, taking into account both their frequency within that document and their rarity across an entire corpus.</p><h2>What problem does TF-IDF solve?</h2><p>Imagine you're building a search engine or a text classification model. You have a large collection of documents, each with its own set of keywords. Your goal is to:</p><ol><li><p>Identify the most important words in each document.</p></li><li><p>Compare the importance of these words across different documents.</p></li></ol><p>The challenge lies in deciding which words are truly significant and which ones are just noise. This is where TF-IDF comes in &#8211; it helps you quantify the relevance of each word in a document, considering both its frequency (how often it appears) and rarity (how uncommon it is across all documents).</p><h2>Term Frequency (TF)</h2><p>The first part of TF-IDF is Term Frequency (TF), which measures how frequently a word appears within a single document. It's calculated as the number of times a word occurs in a document, divided by the total number of words in that document.</p><p>Mathematically, TF can be represented as:</p><pre><code>TF(word, doc) = (number of occurrences of word in doc) / (total number of words in doc)</code></pre><p>For example, if you have a document with 100 words and the word "cat" appears 5 times, its TF would be <code>5/100 = 0.05</code>.</p><h2>Inverse Document Frequency (IDF)</h2><p>The second part is Inverse Document Frequency (IDF), which measures how rare or common a word is across all documents in the corpus. 
It's calculated as the logarithm of the ratio of the total number of documents to the number of documents containing that word.</p><p>Mathematically, IDF can be represented as:</p><pre><code>IDF(word) = log(total number of documents / (number of documents containing word))</code></pre><p>Using our previous example, if there are 1000 documents in the corpus and only 50 contain the word "cat", its IDF (using the natural logarithm) would be <code>ln(1000/50) = ln(20) &#8776; 2.996</code>. With a base-10 logarithm the value would be about 1.301; the base only rescales the scores, as long as it is used consistently.</p><h2>Combining TF and IDF: TF-IDF</h2><p>To create a meaningful measure of importance for each word in a document, you combine the Term Frequency (TF) with the Inverse Document Frequency (IDF). This is done by multiplying the two values together:</p><pre><code>TF-IDF(word, doc) = TF(word, doc) * IDF(word)</code></pre><p>In our example, the TF-IDF score for the word "cat" would be <code>0.05 * 2.996 &#8776; 0.150</code>.</p><h2>Interpretation and Applications</h2><p>A high TF-IDF score indicates that a word is both frequent in a document (high TF) and rare across all documents (high IDF). 
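</p><p>The "cat" example can be reproduced with a minimal sketch in plain Python. The helper names <code>term_frequency</code> and <code>inverse_document_frequency</code> and the filler document are illustrative assumptions, not part of any library. The sketch uses <code>math.log</code>, which is the natural logarithm, so a different log base would give rescaled numbers:</p><pre><code>import math


def term_frequency(word, doc_tokens):
    # TF: occurrences of the word divided by the total number of words
    return doc_tokens.count(word) / len(doc_tokens)


def inverse_document_frequency(total_docs, docs_with_word):
    # IDF: natural log of (total documents / documents containing the word)
    return math.log(total_docs / docs_with_word)


# Made-up 100-word document in which "cat" appears 5 times
doc_tokens = ["cat"] * 5 + ["filler"] * 95

tf = term_frequency("cat", doc_tokens)
idf = inverse_document_frequency(total_docs=1000, docs_with_word=50)
tf_idf = tf * idf

print(round(tf, 4), round(idf, 4), round(tf_idf, 4))  # 0.05 2.9957 0.1498</code></pre><p>A word scores highly only when both factors are high. 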
This suggests that the word is important for distinguishing this particular document from others.</p><p>TF-IDF has numerous applications, including:</p><ul><li><p><strong>Document classification</strong>: Use TF-IDF to select the most informative features (words) for a given document.</p></li><li><p><strong>Search engines</strong>: Implement TF-IDF-based ranking algorithms to retrieve relevant documents in response to user queries.</p></li><li><p><strong>Information retrieval</strong>: Use TF-IDF to identify the most representative words for a query or topic.</p></li><li><p><strong>Text summarization</strong>: Use TF-IDF to select the most important sentences or phrases from a document.</p></li></ul><h2>Limitations and Variations</h2><p>While TF-IDF is a powerful technique, it has some limitations:</p><ul><li><p><strong>Stopwords</strong>: Common words like "the" and "and" carry little information. IDF already pushes their scores toward zero, but their very high term frequency can still add noise, so they are often removed before scoring.</p></li><li><p><strong>Overlapping documents</strong>: If multiple documents contain the same rare word, IDF may underestimate its importance.</p></li></ul><p>To address these issues, you can modify the TF-IDF formula or use variations like:</p><ul><li><p><strong>Logarithmic normalization</strong>: Apply a logarithmic transformation to the TF values (for example, <code>1 + log(count)</code>) to reduce the impact of very frequent words.</p></li><li><p><strong>Smoothing</strong>: Add a small constant (typically 1) to the document counts inside the IDF to avoid division by zero for words that appear in no document and to improve robustness.</p></li></ul><p>In conclusion, TF-IDF is a versatile technique for measuring the importance of words in a document, taking into account both their frequency and rarity across all documents. 
Its applications are diverse, ranging from search engines and text classification to information retrieval and summarization.</p>]]></content:encoded></item><item><title><![CDATA[Building Data Apps with Streamlit]]></title><description><![CDATA[Streamlit is an open-source Python library that makes it incredibly easy to create interactive web applications for data exploration and learning. Instead of wrestling with HTML, CSS, or JavaScript, we can build beautiful and functional apps using just Python.]]></description><link>https://anujyadav.substack.com/p/building-data-apps-with-streamlit</link><guid isPermaLink="false">https://anujyadav.substack.com/p/building-data-apps-with-streamlit</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Sun, 04 May 2025 02:28:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!uYdy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Streamlit is an open-source Python library that makes it incredibly easy to create interactive web applications for data exploration and learning. Instead of wrestling with HTML, CSS, or JavaScript, we can build beautiful and functional apps using just Python.</p><p>It is simple to get started and provides rapid development support via a wide range of widgets (buttons, sliders, dropdowns, etc.). 
Being a Python library, it plays nicely with popular Python data science libraries like Pandas, NumPy, Matplotlib, Seaborn, Plotly, and scikit-learn.</p><h2>Setting up the environment</h2><ul><li><p>Create a new project directory and switch to it in a terminal</p></li><li><p>Create a virtual environment</p><pre><code>python -m venv venv</code></pre></li><li><p>Activate the virtual environment (the command below is for Windows; on macOS or Linux use <code>source venv/bin/activate</code>)</p><pre><code>.\venv\Scripts\activate</code></pre></li></ul><h2>Hello from Streamlit</h2><ul><li><p>Install the <code>streamlit</code> dependency.</p><pre><code>pip install streamlit</code></pre></li><li><p>Create a new Python file (e.g., <code>main.py</code>) and add a title and introductory text</p><pre><code>import streamlit as st

st.title("Hello from Streamlit!")
st.write("A simple Streamlit app to explore a dataset.")</code></pre></li><li><p>To run the app, use the command below (assuming your filename is <code>main.py</code>)</p><pre><code>streamlit run main.py</code></pre></li><li><p>This will automatically open a new tab in your web browser displaying your app. You should see the title and the text you added.</p></li></ul><h2>Working with data</h2><p>Let's load a sample dataset. We'll use a simple CSV file. For this example, let's create a dummy CSV named <code>sample_data.csv</code> with the following content:</p><pre><code>Name,Age,City,Score
Alice,25,New York,85
Bob,30,London,92
Charlie,22,Paris,78
David,35,Tokyo,95
Eve,28,New York,88</code></pre><p>Let's load and display this data in our app:</p><pre><code>import streamlit as st
import pandas as pd

st.title("Hello from Streamlit!")
st.write("A simple Streamlit app to explore a dataset.")

# Load the data
try:
    df = pd.read_csv("sample_data.csv")
except FileNotFoundError:
    st.error("Make sure 'sample_data.csv' is in the same directory.")
    st.stop()

# Display the raw data
st.subheader("Raw Data")
st.dataframe(df)  # Or st.table(df) for a static table</code></pre><p>The app will show a "Raw Data" section with the DataFrame displayed. <code>st.dataframe()</code> provides an interactive table with sorting and searching capabilities. <code>st.table()</code> displays a static table.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uYdy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uYdy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png 424w, https://substackcdn.com/image/fetch/$s_!uYdy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png 848w, https://substackcdn.com/image/fetch/$s_!uYdy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png 1272w, https://substackcdn.com/image/fetch/$s_!uYdy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uYdy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png" width="766" height="488" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00a31970-89c8-4742-828d-eb8163e5d934_766x488.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:488,&quot;width&quot;:766,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40532,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/162788458?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uYdy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png 424w, https://substackcdn.com/image/fetch/$s_!uYdy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png 848w, https://substackcdn.com/image/fetch/$s_!uYdy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png 1272w, https://substackcdn.com/image/fetch/$s_!uYdy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a31970-89c8-4742-828d-eb8163e5d934_766x488.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Adding interactivity with widgets</h2><p>Let's add some widgets to allow users to interact with the data.</p><ul><li><p><strong>Checkbox:</strong> To show/hide the raw data.</p></li><li><p><strong>Selectbox:</strong> To filter data based on a column.</p></li><li><p><strong>Slider:</strong> To filter data based on a numerical range.</p></li></ul><pre><code>import streamlit as st
import pandas as pd

st.title("Hello from Streamlit!")
st.write("A simple Streamlit app to explore a dataset.")

# Load the data
try:
    df = pd.read_csv("sample_data.csv")
except FileNotFoundError:
    st.error("Make sure 'sample_data.csv' is in the same directory.")
    st.stop()

# Show/hide raw data
if st.checkbox("Show raw data"):
    st.subheader("Raw Data")
    st.dataframe(df)

# Filter by city
cities = df['City'].unique()
selected_city = st.selectbox("Filter by City:", ["All"] + list(cities))
if selected_city != "All":
    df_filtered_city = df[df['City'] == selected_city]
    st.subheader(f"Data for {selected_city}")
    st.dataframe(df_filtered_city)

# Filter by age
min_age = int(df['Age'].min())
max_age = int(df['Age'].max())
age_range = st.slider("Filter by Age:", min_age, max_age, (min_age, max_age))
df_filtered_age = df[(df['Age'] &gt;= age_range[0]) &amp; (df['Age'] &lt;= age_range[1])]
st.subheader(f"Data for Ages {age_range[0]} to {age_range[1]}")
st.dataframe(df_filtered_age)</code></pre><p>Now the app has a checkbox to show or hide the raw data, a dropdown to select a city, and a slider to filter by age, each updating the displayed data dynamically.</p><h2>Visualizing Data</h2><p>Let's create some visualizations to gain insights from the data. We'll use Matplotlib and Seaborn for this; if they are not already installed, run <code>pip install matplotlib seaborn</code> in the virtual environment.</p><pre><code>import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

st.title("Hello from Streamlit!")
st.write("A simple Streamlit app to explore a dataset.")

# Load the data
try:
    df = pd.read_csv("sample_data.csv")
except FileNotFoundError:
    st.error("Make sure 'sample_data.csv' is in the same directory.")
    st.stop()

# Show/hide raw data
if st.checkbox("Show raw data"):
    st.subheader("Raw Data")
    st.dataframe(df)

# Filter by city
cities = df['City'].unique()
selected_city = st.selectbox("Filter by City:", ["All"] + list(cities))
df_filtered = df[df['City'] == selected_city] if selected_city != "All" else df

# Filter by age (bounds come from the full dataset, so the slider keeps a
# real range even when the selected city has a single row)
min_age = int(df['Age'].min())
max_age = int(df['Age'].max())
age_range = st.slider("Filter by Age:", min_age, max_age, (min_age, max_age))
df_filtered = df_filtered[(df_filtered['Age'] &gt;= age_range[0]) &amp; (df_filtered['Age'] &lt;= age_range[1])]

st.subheader("Filtered Data")
st.dataframe(df_filtered)
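
# Optional guard (an addition to the original example, shown as a sketch):
# stop before plotting when the filters leave no rows to chart.
if df_filtered.empty:
    st.info("No rows match the current filters.")
    st.stop()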

# --- Visualizations ---
st.subheader("Data Visualizations")

# Score distribution
st.subheader("Score Distribution")
fig_score, ax_score = plt.subplots()
sns.histplot(df_filtered['Score'], kde=True, ax=ax_score)
st.pyplot(fig_score)

# Age vs. Score
st.subheader("Age vs. Score")
fig_scatter, ax_scatter = plt.subplots()
sns.scatterplot(x='Age', y='Score', data=df_filtered, hue='City', ax=ax_scatter)
st.pyplot(fig_scatter)

# City counts
st.subheader("Number of People per City")
city_counts = df_filtered['City'].value_counts()
st.bar_chart(city_counts)</code></pre><p>Here, we've added three visualizations: a histogram showing the distribution of scores, a scatter plot showing the relationship between age and score, colored by city, and a bar chart of the number of people per city. We use <code>st.pyplot()</code> to display the Matplotlib figures and <code>st.bar_chart()</code> for the bar chart, which Streamlit renders natively.</p><h2>Layout and Organization</h2><p>Streamlit provides several ways to organize your app's layout for better readability and structure.</p><ul><li><p><code>st.sidebar</code><strong>:</strong> To create a sidebar for filters and controls.</p></li><li><p><code>st.columns</code><strong>:</strong> To arrange elements in multiple columns.</p></li><li><p><code>st.expander</code><strong>:</strong> To hide content that users can expand if needed.</p></li></ul><p>Let's incorporate a sidebar for our filters:</p><pre><code>import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

st.title("Hello from Streamlit!")
st.write("A simple Streamlit app to explore a dataset.")

# Load the data
try:
    df = pd.read_csv("sample_data.csv")
except FileNotFoundError:
    st.error("Make sure 'sample_data.csv' is in the same directory.")
    st.stop()

# --- Sidebar for Filters ---
st.sidebar.header("Filter Data")

# Filter by city in the sidebar
cities = df['City'].unique()
selected_city = st.sidebar.selectbox("Filter by City:", ["All"] + list(cities))
df_filtered = df[df['City'] == selected_city] if selected_city != "All" else df

# Filter by age in the sidebar (bounds come from the full dataset, so the
# slider keeps a real range even when the selected city has a single row)
min_age = int(df['Age'].min())
max_age = int(df['Age'].max())
age_range = st.sidebar.slider("Filter by Age:", min_age, max_age, (min_age, max_age))
df_filtered = df_filtered[(df_filtered['Age'] &gt;= age_range[0]) &amp; (df_filtered['Age'] &lt;= age_range[1])]

# Show/hide raw data in the main area
if st.checkbox("Show raw data"):
    st.subheader("Raw Data")
    st.dataframe(df)

# --- Main Area ---
st.subheader("Filtered Data")
st.dataframe(df_filtered)
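
# Optional sketch (an addition to the original example): st.columns and
# st.expander, listed in the text before this block, used to summarize
# the filtered rows. The labels and chosen metrics are illustrative.
col_rows, col_avg = st.columns(2)
col_rows.metric("Rows", len(df_filtered))
col_avg.metric("Average score", round(float(df_filtered['Score'].mean()), 1))
with st.expander("Summary statistics"):
    st.dataframe(df_filtered.describe())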

# --- Visualizations ---
st.subheader("Data Visualizations")

# Score distribution
st.subheader("Score Distribution")
fig_score, ax_score = plt.subplots()
sns.histplot(df_filtered['Score'], kde=True, ax=ax_score)
st.pyplot(fig_score)

# Age vs. Score
st.subheader("Age vs. Score")
fig_scatter, ax_scatter = plt.subplots()
sns.scatterplot(x='Age', y='Score', data=df_filtered, hue='City', ax=ax_scatter)
st.pyplot(fig_scatter)

# City counts
st.subheader("Number of People per City")
city_counts = df_filtered['City'].value_counts()
st.bar_chart(city_counts)</code></pre><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;5b5cdca4-064a-47d9-a24f-26aaa992d4d6&quot;,&quot;duration&quot;:null}"></div>]]></content:encoded></item><item><title><![CDATA[Unleash Your Personal AI Sidekick: Setting Up a Local LLM Assistant]]></title><description><![CDATA[Have you ever wished for a brilliant assistant who's always ready to help, never sleeps, and has access to vast knowledge?]]></description><link>https://anujyadav.substack.com/p/unleash-your-personal-ai-sidekick</link><guid isPermaLink="false">https://anujyadav.substack.com/p/unleash-your-personal-ai-sidekick</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Wed, 16 Apr 2025 04:22:20 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0a17adc6-0367-42ce-973f-817f55f6962a_1037x275.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever wished for a brilliant assistant who's always 
ready to help, never sleeps, and has access to vast knowledge? Well, buckle up because we're about to turn that sci-fi dream into reality! In this post, we'll guide you through setting up your own Large Language Model (LLM) assistant on your local machine. Get ready to supercharge your productivity and tap into the power of AI, all without relying on cloud services or sharing your data.</p><p>Large Language Models have taken the tech world by storm, offering unprecedented natural language processing capabilities. While cloud-based services like ChatGPT have garnered much attention, there's a growing movement toward local, privacy-focused AI assistants. By running an LLM on your own hardware, you maintain full control over your data and can customize the model to your specific needs.</p>
<p>Let's dive into how you can set up this game-changing tool using Ollama, an open-source framework for running LLMs locally.</p><h2>Section 1: Preparing Your Battlestation</h2><p>Before we summon our AI assistant, let's make sure our computer is ready for the task.</p><h3>1.1 System Requirements</h3><ul><li><p>A decent CPU (4+ cores recommended)</p></li><li><p>At least 8GB of RAM (16GB+ for larger models)</p></li><li><p>10GB+ free disk space</p></li><li><p>Windows 10/11, macOS, or Linux</p></li></ul><h3>1.2 Setting Up the Environment</h3><p>We'll use Ollama as our LLM framework. There are official installers for all major platforms. We will be using Windows for this post.</p><p>The very first step is to download <a href="https://ollama.com/download">Ollama</a>.</p><blockquote><p><strong>Pro Tip:</strong> To keep your C: drive clutter-free, set the <code>OLLAMA_MODELS</code> environment variable so that models are stored on another drive. Restart Ollama afterwards so it picks up the change.</p></blockquote><pre><code># PowerShell command to set OLLAMA_MODELS environment variable
[System.Environment]::SetEnvironmentVariable('OLLAMA_MODELS', 'D:\OllamaModels', [System.EnvironmentVariableTarget]::User)</code></pre><h2>Section 2: Summoning Your AI Assistant</h2><p>Now that our battlestation is primed, let's bring our AI assistant to life!</p><h3>2.1 Installing Ollama</h3><ol><li><p>On Windows, the native installer is sufficient; WSL2 is only needed if you prefer to run Ollama inside Linux</p></li><li><p>Download and install Ollama following the instructions at ollama.com</p></li><li><p>Open a terminal and run:</p></li></ol><pre><code>ollama run llama2</code></pre><p>This command downloads the Llama 2 model, a powerful general-purpose LLM, on first use and then starts an interactive chat session with it.</p><h3>2.2 Choosing Your AI Companion</h3><p>Ollama supports various models. Here are some popular choices:</p><ul><li><p>Llama 2: Great all-rounder</p></li><li><p>CodeLlama: Specialized for programming tasks</p></li><li><p>Mistral: Excellent performance-to-size ratio</p></li></ul><p>To run a specific model:</p><pre><code>ollama run codellama</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FKpi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FKpi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png 424w, https://substackcdn.com/image/fetch/$s_!FKpi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png 848w, 
https://substackcdn.com/image/fetch/$s_!FKpi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png 1272w, https://substackcdn.com/image/fetch/$s_!FKpi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FKpi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png" width="850" height="217" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:217,&quot;width&quot;:850,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32120,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/161433644?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FKpi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png 424w, https://substackcdn.com/image/fetch/$s_!FKpi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png 
848w, https://substackcdn.com/image/fetch/$s_!FKpi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png 1272w, https://substackcdn.com/image/fetch/$s_!FKpi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f340e47-2c2e-4cd3-a760-cbdf1be8817d_850x217.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2>Section 3: Interfacing with Your AI Assistant</h2><p>Now that your AI is alive and kicking, let's give it a proper interface!</p><h3>3.1 Command-Line Interface</h3><p>The simplest way to interact is through the terminal:</p><pre><code>ollama run llama2 "Explain quantum computing in simple terms"</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nvvP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nvvP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png 424w, https://substackcdn.com/image/fetch/$s_!nvvP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png 848w, https://substackcdn.com/image/fetch/$s_!nvvP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png 1272w, 
https://substackcdn.com/image/fetch/$s_!nvvP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nvvP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png" width="1037" height="275" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:275,&quot;width&quot;:1037,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46055,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/161433644?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nvvP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png 424w, https://substackcdn.com/image/fetch/$s_!nvvP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png 848w, 
https://substackcdn.com/image/fetch/$s_!nvvP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png 1272w, https://substackcdn.com/image/fetch/$s_!nvvP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094bfc2e-3b4b-4140-a4a2-98ed53134647_1037x275.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>3.2 Graphical User Interfaces</h3><p>For a more user-friendly experience, consider these graphical front ends:</p><ul><li><p>Open WebUI: An open-source, 
sleek, web-based interface</p></li><li><p>LM Studio: A desktop application with model management</p></li></ul><h2>Section 4: Customizing Your AI Sidekick</h2><p>One of the biggest advantages of running your own LLM is customization. Let's explore how to tailor your AI assistant to your needs.</p><h3>4.1 Fine-tuning</h3><p>Fine-tuning allows you to specialize the model for specific tasks or domains. While advanced, it can significantly improve performance for your use cases.</p><h3>4.2 Prompt Engineering</h3><p>Craft effective prompts to guide your AI's responses. For example:</p><pre><code>You are an expert software engineer. Analyze the following code for potential bugs and suggest improvements:

[Your code here]</code></pre><h2>Conclusion: Your AI Adventure Begins</h2><p>Congratulations! You've just set up your personal AI assistant powered by cutting-edge LLM technology. With this powerful tool at your fingertips, you're ready to tackle complex problems, generate creative ideas, and boost your productivity like never before.</p><p>Remember, the key to getting the most out of your AI sidekick lies in experimentation and learning. Try different models, craft various prompts, and discover how this technology can best serve your unique needs.</p><p>Now, go forth and conquer with your new AI companion by your side! Share your experiences and creative uses in the comments below.</p>]]></content:encoded></item><item><title><![CDATA[Create an AI powered React Native App - Part 1]]></title><description><![CDATA[We will be using Expo to create our React Native App.]]></description><link>https://anujyadav.substack.com/p/create-an-ai-powered-react-native</link><guid isPermaLink="false">https://anujyadav.substack.com/p/create-an-ai-powered-react-native</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Thu, 27 Mar 2025 03:04:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Slor!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We will be using Expo to create our React Native App.</p><h2>Pre-requisites</h2><ul><li><p>Install Expo Orbit - https://expo.dev/orbit</p></li><li><p>Install Expo Go - https://expo.dev/go</p></li><li><p>Install Android Studio - You may also want to set up the Android SDK and an emulator.</p></li><li><p>Ensure Node.js is installed.</p></li><li><p>Ensure that VS Code, the favorite lightweight powerhouse, is installed.</p></li><li><p>Follow the official tutorial - https://docs.expo.dev/tutorial/create-your-first-app/</p></li></ul><p></p><blockquote><p>Orbit will be very helpful for quickly starting your development environment.</p><p><strong>Note</strong>: Windows Defender or other antivirus software may flag it as a virus. You will need to add an exception (if you trust the source).
</p></blockquote><h2>Create an Expo React Native App</h2><pre><code>npx create-expo-app@latest &lt;App Name&gt;</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XCFA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XCFA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png 424w, https://substackcdn.com/image/fetch/$s_!XCFA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png 848w, https://substackcdn.com/image/fetch/$s_!XCFA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png 1272w, https://substackcdn.com/image/fetch/$s_!XCFA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XCFA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png" width="661" height="262" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:661,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:35531,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/159908593?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XCFA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png 424w, https://substackcdn.com/image/fetch/$s_!XCFA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png 848w, https://substackcdn.com/image/fetch/$s_!XCFA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png 1272w, https://substackcdn.com/image/fetch/$s_!XCFA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F213f0e4c-5b30-498f-9f87-3a0d12356f56_661x262.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Once the command finishes, the project is ready. The output looks like this:</p><pre><code>
&#9989; Your project is ready!

To run your project, navigate to the directory and run one of the following npm commands.

- cd ReactBlast
- npm run android
- npm run ios # you need to use macOS to build the iOS project - use the Expo app if you need to do iOS development without a Mac
- npm run web</code></pre><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Slor!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Slor!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png 424w, https://substackcdn.com/image/fetch/$s_!Slor!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png 848w, https://substackcdn.com/image/fetch/$s_!Slor!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png 1272w, https://substackcdn.com/image/fetch/$s_!Slor!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Slor!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png" width="666" height="520" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:520,&quot;width&quot;:666,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37188,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/159908593?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Slor!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png 424w, https://substackcdn.com/image/fetch/$s_!Slor!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png 848w, https://substackcdn.com/image/fetch/$s_!Slor!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png 1272w, https://substackcdn.com/image/fetch/$s_!Slor!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8fe2798-7732-40ed-b66b-b895ae1fe589_666x520.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><blockquote><p>If it does not work out of the box for you, open Expo Orbit (installed earlier) and create and launch the emulator.</p></blockquote><p>It worked like a charm for me on the first try.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e-Gy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e-Gy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png 424w, 
https://substackcdn.com/image/fetch/$s_!e-Gy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png 848w, https://substackcdn.com/image/fetch/$s_!e-Gy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png 1272w, https://substackcdn.com/image/fetch/$s_!e-Gy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e-Gy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png" width="783" height="844" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:844,&quot;width&quot;:783,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:170538,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/159908593?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e-Gy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png 
424w, https://substackcdn.com/image/fetch/$s_!e-Gy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png 848w, https://substackcdn.com/image/fetch/$s_!e-Gy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png 1272w, https://substackcdn.com/image/fetch/$s_!e-Gy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F81421d29-3f93-408d-a606-8fd1b45509f3_783x844.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2>Restarting your development environment</h2><p>The easiest way is to start Expo Orbit. On Windows, you will find its icon in the system tray, under hidden icons.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FWU5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FWU5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png 424w, https://substackcdn.com/image/fetch/$s_!FWU5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png 848w, https://substackcdn.com/image/fetch/$s_!FWU5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png 1272w, https://substackcdn.com/image/fetch/$s_!FWU5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FWU5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png" width="536" height="415" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:415,&quot;width&quot;:536,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36267,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/159908593?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FWU5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png 424w, https://substackcdn.com/image/fetch/$s_!FWU5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png 848w, https://substackcdn.com/image/fetch/$s_!FWU5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png 1272w, https://substackcdn.com/image/fetch/$s_!FWU5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5451071a-d1a1-4a4a-9314-5bd00222698c_536x415.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The elephant in the room: The Android Emulator</h2><p>I already had the Android SDK set up and emulators created on my system, so it worked right away.</p><p>You can see the &#8220;Emulators&#8221; section in the image above. 
If you have not installed <a href="https://developer.android.com/studio">Android Studio</a> yet, do that first.</p><h3><strong>Setting up the Android Emulator:</strong></h3><ul><li><p><strong>Open Android Studio:</strong> Launch Android Studio.</p></li><li><p><strong>Open Device Manager:</strong> Go to Tools &gt; Device Manager (called AVD Manager in older Android Studio versions).</p></li><li><p><strong>Create Virtual Device:</strong> Click on "Create Virtual Device".</p></li><li><p><strong>Select Device Definition:</strong> Choose a device definition (e.g., Pixel 4).</p></li><li><p><strong>Select System Image:</strong> Choose an Android OS version (e.g., the latest API level) and download the system image if necessary.</p></li><li><p><strong>Configure and Finish:</strong> Configure the emulator settings (e.g., name, orientation) and click "Finish".</p></li></ul><h3>Using Expo Go</h3><p>You can also use Expo Go. Just browse to the <a href="https://expo.dev/go?sdkVersion=52&amp;platform=android&amp;device=false">Expo Go page</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bjTw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bjTw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png 424w, https://substackcdn.com/image/fetch/$s_!bjTw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png 848w, 
https://substackcdn.com/image/fetch/$s_!bjTw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png 1272w, https://substackcdn.com/image/fetch/$s_!bjTw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bjTw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png" width="1000" height="518" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:518,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:100044,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://anujyadav.substack.com/i/159908593?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!bjTw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png 424w, 
https://substackcdn.com/image/fetch/$s_!bjTw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png 848w, https://substackcdn.com/image/fetch/$s_!bjTw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png 1272w, https://substackcdn.com/image/fetch/$s_!bjTw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dbaf783-d4c0-4577-94b4-af7d5b108328_1000x518.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>You can use the &#8220;Install with orbit&#8221; option.</p><h2>Last but not least: Re-running the app</h2><p>I found that the command below is the friendliest way to run your app.</p><pre><code>npx expo start</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JPRY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JPRY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png 424w, https://substackcdn.com/image/fetch/$s_!JPRY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png 848w, https://substackcdn.com/image/fetch/$s_!JPRY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png 1272w, https://substackcdn.com/image/fetch/$s_!JPRY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JPRY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png" width="881" height="711" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:711,&quot;width&quot;:881,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!JPRY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png 424w, https://substackcdn.com/image/fetch/$s_!JPRY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png 848w, https://substackcdn.com/image/fetch/$s_!JPRY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png 1272w, https://substackcdn.com/image/fetch/$s_!JPRY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78fa9796-39be-4c72-9f26-3dca411b9834_881x711.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Java: Observer Pattern]]></title><description><![CDATA[The Observer Pattern is a behavioral design pattern used in Java to establish a one-to-many dependency between objects.]]></description><link>https://anujyadav.substack.com/p/java-observer-pattern</link><guid isPermaLink="false">https://anujyadav.substack.com/p/java-observer-pattern</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Fri, 27 Dec 2024 09:56:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The <strong>Observer Pattern</strong> is a behavioral design pattern used in Java to establish a one-to-many dependency between objects. 
When one object (called the <strong>Subject</strong>) changes its state, all dependent objects (<strong>Observers</strong>) are notified and updated automatically.</p><h3><strong>When to Use Observer Pattern</strong></h3><ul><li><p>When changes to one object require notifying and updating multiple dependent objects.</p></li><li><p>In event-driven architectures or pub-sub systems.</p></li><li><p>To implement features like notifications, UI updates, or reactive programming.</p></li></ul><h3>Basic Observer Pattern Implementation</h3><h4>Define the Observer Interface</h4><pre><code>package com.agiledcoders.observer;

public interface Observer&lt;T&gt; {
    void update(T data);
}</code></pre><p>The <code>update</code> method will receive the updates from the Subject.</p><h3>Define the Subject Interface</h3><pre><code>package com.agiledcoders.observer;

public interface Subject&lt;T&gt; {
    void register(Observer&lt;T&gt; observer);
    void unregister(Observer&lt;T&gt; observer);
    void notify(T data);
}
</code></pre><ul><li><p>The <code>register</code> method adds an Observer to the Subject.</p></li><li><p>The <code>unregister</code> method removes a previously registered Observer.</p></li><li><p>The <code>notify</code> method pushes the given data to every registered Observer.</p></li></ul><h3>Implement Subject</h3><pre><code>package com.agiledcoders.observer;

import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class SubjectImpl implements Subject&lt;String&gt; {
    private final List&lt;Observer&lt;String&gt;&gt; observers = new ArrayList&lt;&gt;();

    @Override
    public void register(Observer&lt;String&gt; observer) {
        if (Objects.isNull(observer)) {
            throw new IllegalArgumentException("Observer must be non-null");
        }
        this.observers.add(observer);
    }

    @Override
    public void unregister(Observer&lt;String&gt; observer) {
        this.observers.remove(observer);
    }

    @Override
    public void notify(String data) {
        this.observers.forEach(stringObserver -&gt; stringObserver.update(data));
    }
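
    // Hypothetical addition, not part of the original post: a sketch of a
    // snapshot-based variant of notify. It assumes an Observer might call
    // unregister(...) from inside update(...); iterating over a copy of the
    // list avoids a ConcurrentModificationException in that case.
    // It is not declared on Subject, so callers need a SubjectImpl reference.
    public void notifyOnSnapshot(String data) {
        List&lt;Observer&lt;String&gt;&gt; snapshot = new ArrayList&lt;&gt;(this.observers);
        snapshot.forEach(observer -&gt; observer.update(data));
    }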
}</code></pre><p></p><h3>Implement Observer</h3><pre><code>package com.agiledcoders.observer;

public class ObserverImpl implements Observer&lt;String&gt; {
    // Shared counter used only to hand out ids; each observer keeps its own id.
    private static int counter;
    private final int id;

    public ObserverImpl() {
        this.id = ++counter;
    }

    @Override
    public void update(String data) {
        System.out.println("----------------------------------------");
        System.out.println("Observer Id: " + id + " received update.");
        System.out.println(data);
        System.out.println("----------------------------------------");
    }
}</code></pre><p></p><h3>Create a Runner to test the pattern</h3><pre><code>package com.agiledcoders.observer;

public class Runner {
    public static void main(String[] args) {
        Observer&lt;String&gt; first = new ObserverImpl();
        Observer&lt;String&gt; second = new ObserverImpl();

        Subject&lt;String&gt; subject = new SubjectImpl();

        subject.register(first);
        subject.register(second);

        subject.notify("First iteration");

        subject.unregister(first);
        subject.notify("Second iteration");
    }
}</code></pre><p></p><h3>Output</h3><pre><code>----------------------------------------
Observer Id: 1 received update.
First iteration
----------------------------------------
----------------------------------------
Observer Id: 2 received update.
First iteration
----------------------------------------
----------------------------------------
Observer Id: 2 received update.
Second iteration
----------------------------------------</code></pre>]]></content:encoded></item><item><title><![CDATA[Add JUnit to Maven Project]]></title><description><![CDATA[Modify the pom.xml file of your Maven project to include the JUnit dependency.]]></description><link>https://anujyadav.substack.com/p/add-junit-to-maven-project</link><guid isPermaLink="false">https://anujyadav.substack.com/p/add-junit-to-maven-project</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Fri, 27 Dec 2024 08:05:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UbWO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Modify the <code>pom.xml</code> file of your Maven project to include the JUnit dependency. Include the appropriate dependency snippet depending on the version of JUnit you want to use.</p><h3><strong>JUnit 5 (JUnit Jupiter)</strong> (Recommended)</h3><pre><code>&lt;dependencies&gt;
    &lt;!-- JUnit 5 Dependency --&gt;
    &lt;dependency&gt;
        &lt;groupId&gt;org.junit.jupiter&lt;/groupId&gt;
        &lt;artifactId&gt;junit-jupiter&lt;/artifactId&gt;
        &lt;version&gt;5.10.0&lt;/version&gt; 
        &lt;scope&gt;test&lt;/scope&gt;
    &lt;/dependency&gt;
&lt;/dependencies&gt;</code></pre><p></p><h3><strong>JUnit 4</strong> (For backward compatibility)</h3><pre><code>&lt;dependencies&gt;
    &lt;!-- JUnit 4 Dependency --&gt;
    &lt;dependency&gt;
        &lt;groupId&gt;junit&lt;/groupId&gt;
        &lt;artifactId&gt;junit&lt;/artifactId&gt;
        &lt;version&gt;4.13.2&lt;/version&gt; 
        &lt;scope&gt;test&lt;/scope&gt;
    &lt;/dependency&gt;
&lt;/dependencies&gt;</code></pre><p></p><p>The project structure and POM look as follows:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UbWO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UbWO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png 424w, https://substackcdn.com/image/fetch/$s_!UbWO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png 848w, https://substackcdn.com/image/fetch/$s_!UbWO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png 1272w, https://substackcdn.com/image/fetch/$s_!UbWO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UbWO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png" width="1361" height="640" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:1361,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:98184,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UbWO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png 424w, https://substackcdn.com/image/fetch/$s_!UbWO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png 848w, https://substackcdn.com/image/fetch/$s_!UbWO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png 1272w, https://substackcdn.com/image/fetch/$s_!UbWO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a72e946-805c-4fe0-9fcd-bc90e07d93ac_1361x640.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h3>Verify JUnit integration</h3><p>JUnit tests should be placed in the <code>src/test/java</code> directory of your Maven project.</p><ol><li><p>Navigate to <code>src/test/java</code>.</p></li><li><p>Create a new test class if it is not already present (e.g., <code>AppTest.java</code>).</p></li></ol><h4>Example for JUnit 5</h4><pre><code>package org.example;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

public class AppTest {
    @Test
    void testMultiplication() {
        int result = 2 * 3;
        assertEquals(6, result, "Result of multiplying 2 &amp; 3 should be 6");
    }
}</code></pre><h4>Execute the test</h4><pre><code>mvn test</code></pre><p>If all goes well, you will see output similar to the following:</p><pre><code>[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.example.AppTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.040 s -- in org.example.AppTest
[INFO] 
[INFO] Results:
[INFO]
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  2.055 s
[INFO] Finished at: 2024-12-27T13:27:44+05:30
[INFO] ------------------------------------------------------------------------</code></pre><p></p>]]></content:encoded></item><item><title><![CDATA[Issues with Clock-Bound Wait and why the general understanding is flawed]]></title><description><![CDATA[Refer Clock Bound Wait]]></description><link>https://anujyadav.substack.com/p/issues-with-clock-bound-wait-and</link><guid isPermaLink="false">https://anujyadav.substack.com/p/issues-with-clock-bound-wait-and</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Tue, 22 Oct 2024 16:12:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ovrI!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c2fe191-6aea-48f4-83fd-21f61c257114_1200x1200.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Refer</h2><p><a href="https://martinfowler.com/articles/patterns-of-distributed-systems/clock-bound-wait.html">Clock Bound Wait</a></p><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://anujyadav.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Engineering Mojo! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2><strong>Clock-Bound Wait</strong></h2><div class="pullquote"><p>Wait to cover the uncertainty in time across cluster nodes before reading and writing values so that values can be correctly ordered across cluster nodes.</p></div><p>A <strong>Clock-Bound Wait</strong> involves waiting for a certain time period or until a specific clock condition is met before an operation proceeds. <strong>This approach relies on the </strong><em><strong>assumption</strong></em><strong> that all nodes in the </strong><em><strong>distributed system</strong></em><strong> </strong><em><strong>can synchronize time accurately enough</strong></em><strong> to </strong><em><strong>make coordinated decisions</strong></em><strong> based on </strong><em><strong>local clocks</strong></em>.</p><p></p><h2>The Claimed Solution</h2><div class="pullquote"><p>While reading or writing, cluster nodes wait until the clock values on every node in the cluster are guaranteed to be above the timestamp assigned to the value.</p><p>If the difference between clocks is very small, write requests can wait without adding a great deal of overhead. As an example, assume the maximum clock offset across cluster nodes is 10 ms. (This means that, at any given point in time, the slowest clock in the cluster is lagging behind the fastest one by at most 10 ms.) 
To guarantee that every other cluster node has its clock past time <em>t</em>, the cluster node that handles any write operation will have to wait until <em>t</em> + 10 ms before storing the value.</p></div><p></p><h2>Challenges</h2><ul><li><p><strong>Clock Drift</strong>:</p><ul><li><p>Distributed nodes often experience <strong>clock drift</strong>, where their clocks slightly deviate from each other. Even with technologies like <strong>NTP</strong> (Network Time Protocol), perfect synchronization is challenging. This makes any clock-dependent coordination unreliable, especially over time.</p></li></ul></li><li><p><strong>Lack of Global Clock</strong>:</p><ul><li><p>Distributed systems inherently do not share a single, reliable global clock. Each node operates with its own clock, leading to <strong>time uncertainty</strong> across the system. Inconsistent clocks can cause actions to occur out of order or fail when they should succeed.</p></li></ul></li><li><p><strong>Latency and Network Delays</strong>:</p><ul><li><p>Network latency can add unpredictable delays to message transmission between nodes. If one node sets a clock-bound wait and other nodes have lagging or faster clocks, it could result in decisions being made prematurely or too late, leading to inconsistencies or timeouts.</p></li></ul></li><li><p><strong>Imprecise Failures</strong>:</p><ul><li><p>A node might wait for a certain time to pass based on its local clock, but if it doesn't accurately reflect the real time across the network, this could result in faulty decisions&#8212;especially in consensus algorithms (like Paxos or Raft) or in transactional systems.</p></li></ul><p></p></li></ul><h2>Practical </h2><ul><li><p>Assuming we have achieved some mechanism to calculate a &#8220;Correct Waiting Time&#8221; does that mean we can use this mechanism for a write-heavy system? 
No, as it would result in high latency and low throughput.</p></li><li><p>What mechanism would provide us with the correct waiting time? Won't we end up with a very chatty system?</p></li></ul><h2>Belief</h2><p>There seems to be a belief that Clock-Bound Wait is good for consistency.</p><p></p><h2>Reality</h2><p>Contrary to the general belief, Clock-Bound Wait systems lean towards eventual consistency and availability.</p><p>A system that heavily relies on clock-bound waits while <strong>boasting about consistency</strong> is indeed delusional. Why? Because:</p><ol><li><p><strong>Clock-bound waits sacrifice durability</strong>: If the system proceeds after a timeout without ensuring that all replicas have acknowledged the write, it can lead to inconsistencies.</p></li><li><p><strong>No guarantee of write completeness</strong>: Clock-bound waits can't ensure that the write is durably stored across all required nodes.</p></li><li><p><strong>Delayed reconciliation</strong>: Systems that rely on clock-bound waits will need to perform some form of <strong>eventual reconciliation</strong> to catch up with missed writes, introducing complexity.</p></li></ol><h2><strong>Conclusion</strong></h2><p><strong>Short-circuiting</strong> is a more general term that encompasses techniques like timeouts, buffer limits, and similar mechanisms. These techniques are designed to optimize for <strong>high availability</strong> (HA) and responsiveness when full acknowledgment from all nodes isn't possible or takes too long.
AP systems (Availability and Partition-tolerant systems) are designed around <strong>eventual consistency</strong>. Systems that rely on a majority quorum of <strong>(N + 1)/2</strong> replicas use <strong>timeouts</strong> to achieve this, so calling it a "clock-bound wait" is more of a semantic choice.</p><p>In other words, a <strong>clock-bound wait</strong> isn&#8217;t a revolutionary concept in itself; it&#8217;s just another form of short-circuiting, where the system "decides to wait" for a period of time before acting on the responses it receives.</p><p>In distributed systems, <strong>Clock-Bound Wait</strong> is typically not recommended due to the lack of reliable global clocks, potential clock drift, and network latencies. Instead, <strong>Versioning</strong> techniques like vector clocks, Lamport timestamps, or monotonically increasing versions are much more robust. They handle data consistency, conflict resolution, and causal relationships in an asynchronous and scalable manner, without relying on precise synchronization of clocks across distributed nodes.</p>]]></content:encoded></item><item><title><![CDATA[Docker: Understanding Networking, port sharing and DNS using Golang ]]></title><description><![CDATA[Multiple service instances]]></description><link>https://anujyadav.substack.com/p/docker-understanding-networking-port</link><guid isPermaLink="false">https://anujyadav.substack.com/p/docker-understanding-networking-port</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Sat, 12 Oct 2024 05:17:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/21fc11d3-2e8d-458a-a0f8-5c601942c844_619x777.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Multiple service instances </h2><p>When running multiple instances of a service, the first thing to think about is which port each instance listens on. On a single local machine, each instance generally needs its own port.</p><p>For example, try starting multiple instances of the simple Go server below:</p><pre><code>package main

import (
&#9;"fmt"
&#9;"log"
&#9;"net"
&#9;"net/http"
)

func GetLocalIP() string {
&#9;addrs, err := net.InterfaceAddrs()
&#9;if err != nil {
&#9;&#9;return ""
&#9;}
&#9;for _, address := range addrs {
&#9;&#9;// check the address type; if it is a non-loopback IPv4 address, return it
&#9;&#9;if ipnet, ok := address.(*net.IPNet); ok &amp;&amp; !ipnet.IP.IsLoopback() {
&#9;&#9;&#9;if ipnet.IP.To4() != nil {
&#9;&#9;&#9;&#9;return ipnet.IP.String()
&#9;&#9;&#9;}
&#9;&#9;}
&#9;}
&#9;return ""
}

func helloHandler(w http.ResponseWriter, r *http.Request) {
&#9;fmt.Fprintf(w, "Hello, World from %s!\n", GetLocalIP())
}

func main() {
&#9;http.HandleFunc("/", helloHandler)
&#9;log.Println("Starting Hello World server...")
&#9;log.Fatal(http.ListenAndServe(":8888", nil))
}</code></pre><blockquote><p>The server returns a &#8220;Hello &#8230;&#8221; message with local IP information.</p></blockquote><p></p><h3>Running first instance</h3><p>Below we are starting our Go server.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W304!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W304!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png 424w, https://substackcdn.com/image/fetch/$s_!W304!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png 848w, https://substackcdn.com/image/fetch/$s_!W304!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png 1272w, https://substackcdn.com/image/fetch/$s_!W304!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W304!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png" width="421" height="56.502631578947366" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:51,&quot;width&quot;:380,&quot;resizeWidth&quot;:421,&quot;bytes&quot;:5756,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W304!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png 424w, https://substackcdn.com/image/fetch/$s_!W304!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png 848w, https://substackcdn.com/image/fetch/$s_!W304!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png 1272w, https://substackcdn.com/image/fetch/$s_!W304!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb734ee9a-8cf1-4368-8631-e8a8b83f5af9_380x51.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Check the process information bind to port 8888</h3><pre><code>$port = 8888
Get-Process -Id (Get-NetTCPConnection -LocalPort $port).OwningProcess</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HMGj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HMGj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png 424w, https://substackcdn.com/image/fetch/$s_!HMGj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png 848w, https://substackcdn.com/image/fetch/$s_!HMGj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png 1272w, https://substackcdn.com/image/fetch/$s_!HMGj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HMGj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png" width="797" height="162" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:162,&quot;width&quot;:797,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19036,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HMGj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png 424w, https://substackcdn.com/image/fetch/$s_!HMGj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png 848w, https://substackcdn.com/image/fetch/$s_!HMGj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png 1272w, https://substackcdn.com/image/fetch/$s_!HMGj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec5f1bde-b79c-41ed-94c9-28599eed73b2_797x162.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p></p><h3>Try starting a new instance </h3><pre><code>go run server.go</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!f5wG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f5wG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png 424w, https://substackcdn.com/image/fetch/$s_!f5wG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png 848w, https://substackcdn.com/image/fetch/$s_!f5wG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png 1272w, https://substackcdn.com/image/fetch/$s_!f5wG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f5wG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png" width="384" height="83" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:83,&quot;width&quot;:384,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6937,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f5wG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png 424w, https://substackcdn.com/image/fetch/$s_!f5wG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png 848w, https://substackcdn.com/image/fetch/$s_!f5wG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png 1272w, https://substackcdn.com/image/fetch/$s_!f5wG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46b5ecca-9800-4c24-ab51-025d54f10002_384x83.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><blockquote><p>Notice that the second time the terminal is not blocked as the process was already running.</p></blockquote><p></p><h3>Verify the process information</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!dOPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dOPP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png 424w, https://substackcdn.com/image/fetch/$s_!dOPP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png 848w, https://substackcdn.com/image/fetch/$s_!dOPP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png 1272w, https://substackcdn.com/image/fetch/$s_!dOPP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dOPP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png" width="803" height="166" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:166,&quot;width&quot;:803,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19248,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dOPP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png 424w, https://substackcdn.com/image/fetch/$s_!dOPP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png 848w, https://substackcdn.com/image/fetch/$s_!dOPP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png 1272w, https://substackcdn.com/image/fetch/$s_!dOPP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8332cf24-7e3a-4fe9-bd73-278e93a057f0_803x166.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><blockquote><p>The process information is the same.</p></blockquote><p></p><div class="pullquote"><p>So, two processes did not start and no port sharing happened.</p></div><h3>Change the port and try again</h3><p>Change the port to 9999 and rerun the server.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!kcYz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kcYz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png 424w, https://substackcdn.com/image/fetch/$s_!kcYz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png 848w, https://substackcdn.com/image/fetch/$s_!kcYz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png 1272w, https://substackcdn.com/image/fetch/$s_!kcYz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kcYz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png" width="397" height="76" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a798a09a-147b-48af-afc1-508644179e5c_397x76.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:76,&quot;width&quot;:397,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5972,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kcYz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png 424w, https://substackcdn.com/image/fetch/$s_!kcYz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png 848w, https://substackcdn.com/image/fetch/$s_!kcYz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png 1272w, https://substackcdn.com/image/fetch/$s_!kcYz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa798a09a-147b-48af-afc1-508644179e5c_397x76.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><blockquote><p>The terminal is blocked</p></blockquote><h3>Verify the process information for port 9999</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!Pj-O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pj-O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png 424w, https://substackcdn.com/image/fetch/$s_!Pj-O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png 848w, https://substackcdn.com/image/fetch/$s_!Pj-O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png 1272w, https://substackcdn.com/image/fetch/$s_!Pj-O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Pj-O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png" width="814" height="157" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:157,&quot;width&quot;:814,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19266,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pj-O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png 424w, https://substackcdn.com/image/fetch/$s_!Pj-O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png 848w, https://substackcdn.com/image/fetch/$s_!Pj-O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png 1272w, https://substackcdn.com/image/fetch/$s_!Pj-O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F458acbf9-0cf2-436b-a554-a8cb07539a2d_814x157.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><blockquote><p>The process ID has changed.</p></blockquote><h3>Verify that the process for port 8888 is the same</h3><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!VmM5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VmM5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png 424w, https://substackcdn.com/image/fetch/$s_!VmM5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png 848w, https://substackcdn.com/image/fetch/$s_!VmM5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png 1272w, https://substackcdn.com/image/fetch/$s_!VmM5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VmM5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png" width="806" height="154" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:154,&quot;width&quot;:806,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19139,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VmM5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png 424w, https://substackcdn.com/image/fetch/$s_!VmM5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png 848w, https://substackcdn.com/image/fetch/$s_!VmM5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png 1272w, https://substackcdn.com/image/fetch/$s_!VmM5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6364b3c-89db-4a96-aa54-c3233b638cf4_806x154.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="pullquote"><p>This port change makes it difficult to simulate server clusters and distributed systems locally and fiddle with them.</p></div><h2>Enters Docker</h2><blockquote><p>Docker&#8217;s internal networking allows services with the same port to coexist.</p></blockquote><p>Docker Compose provides an enhanced setup to simulate clusters and fiddle 
with distributed environments. As an engineer, this is gold.</p><h3>Service Isolation</h3><ul><li><p>Each service runs inside its own <strong>Docker container</strong>.</p></li><li><p>They can all listen on the same port (say, <code>8888</code> or <code>9999</code>).</p></li><li><p>They are isolated from each other because each container has its own network stack.</p></li></ul><h3>Internal Docker Network</h3><ul><li><p>Docker Compose automatically creates a <strong>virtual network</strong> for all the services defined in the <code>docker-compose.yml</code> file.</p></li><li><p>Inside this network, each service can be reached by its <strong>service name</strong> (which you define in the <code>docker-compose.yml</code> file).</p></li><li><p>They can run on the same port without conflict because the network namespace of each container is separate.</p></li></ul><h3>Port Binding</h3><ul><li><p>We don't need to bind each service to a port on the host machine.</p></li><li><p>Services in the same Compose stack can communicate with one another over Docker's internal network. Keep reading for an example.</p></li></ul><h3>DNS</h3><p>Docker Compose provides an automatic <strong>DNS resolution</strong> mechanism for services within the same network.</p><h4><strong>Automatic Network Creation</strong></h4><ul><li><p>When you run <code>docker-compose up</code>, Docker creates an isolated network (with its own internal DNS) where all the services can communicate with each other.</p></li><li><p>Multiple containers can use the same port internally (like port <code>8080</code>) because each container has its own IP address within the Docker network.</p></li><li><p>Docker automatically resolves service names (for example, <code>server1</code>, <code>server2</code>, etc.) to their corresponding container IPs. 
Thus, each service name acts like a hostname.</p></li></ul><h4><strong>Service Discovery via Names</strong></h4><ul><li><p>In the <code>docker-compose.yml</code> file, the names you give to services (like <code>server1</code>, <code>server2</code>, etc.) are used as <strong>DNS names</strong> inside the Docker network.</p></li><li><p>Thus, any service in the same Compose network can use a service name as the host part of a URL.</p></li></ul><h4><strong>Internal DNS Handling</strong></h4><ul><li><p>Docker handles the DNS resolution automatically.</p></li><li><p>When an internal service tries to reach <code>http://server1:8888</code>, Docker&#8217;s internal DNS resolves <code>server1</code> to the correct container&#8217;s IP address within the virtual network, so the traffic is routed to the right container.</p></li></ul><h2>Dockerfile for the Go server - server.Dockerfile</h2><pre><code># Official Go image
FROM golang:1.22.5-alpine

# Set current working directory (inside the container)
WORKDIR /app

# Copy go.mod and go.sum files (if you have any)
COPY go.mod go.sum ./

# Download dependencies
RUN go mod download

# Copy files
COPY ./src/goose/server.go .

# Build the Go app
RUN go build -o server .

# Document that the server listens on port 8888
# (EXPOSE alone does not publish the port to the host)
EXPOSE 8888

# Command to run the executable
CMD ["./server"]</code></pre><h3>Port Sharing</h3><p>To see port sharing in practice, we will create a reverse proxy load balancer (LB) that uses the Docker service names to distribute traffic across the servers.</p><h2>Reverse Proxy Load Balancer in Go</h2><p>Here, we implement a basic reverse proxy that distributes incoming traffic across a list of backend services using a round-robin strategy. The LB listens on port <code>9090</code>.</p><pre><code>package main

import (
&#9;"log"
&#9;"net/http"
&#9;"net/http/httputil"
&#9;"net/url"
&#9;"sync/atomic"
)

var backends = []string{
&#9;"http://server1:8888",
&#9;"http://server2:8888",
&#9;"http://server3:8888",
}

var counter uint64

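// getNextBackend returns the next backend in round-robin order.
// The atomic increment keeps the counter safe across concurrent requests.
func getNextBackend() string {
&#9;next := atomic.AddUint64(&amp;counter, 1)
&#9;return backends[(next-1)%uint64(len(backends))]
}

// reverseProxyHandler forwards each incoming request to the next backend.
func reverseProxyHandler(w http.ResponseWriter, r *http.Request) {
&#9;backendURL := getNextBackend()
&#9;// Named target (not url) to avoid shadowing the net/url package.
&#9;target, err := url.Parse(backendURL)
&#9;if err != nil {
&#9;&#9;http.Error(w, "Bad Gateway", http.StatusBadGateway)
&#9;&#9;return
&#9;}

&#9;proxy := httputil.NewSingleHostReverseProxy(target)
&#9;proxy.ServeHTTP(w, r)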
}
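
// Sketch (an assumption, not part of the original post): the handler above
// builds a new ReverseProxy on every request. One alternative is to build
// the proxies once at startup and reuse them. newProxies is a hypothetical
// helper name; it is not wired into main here.
func newProxies(targets []string) ([]*httputil.ReverseProxy, error) {
&#9;proxies := make([]*httputil.ReverseProxy, 0, len(targets))
&#9;for _, t := range targets {
&#9;&#9;u, err := url.Parse(t)
&#9;&#9;if err != nil {
&#9;&#9;&#9;return nil, err
&#9;&#9;}
&#9;&#9;proxies = append(proxies, httputil.NewSingleHostReverseProxy(u))
&#9;}
&#9;return proxies, nil
}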

func main() {
&#9;http.HandleFunc("/", reverseProxyHandler)
&#9;log.Println("Load Balancer running on port 9090...")
&#9;log.Fatal(http.ListenAndServe(":9090", nil))
}</code></pre><h3>Dockerfile for the load balancer - lb.Dockerfile</h3><pre><code><code># Official Go image
FROM golang:1.22.5-alpine

# Set current working directory (inside the container)
WORKDIR /app

# (Standard) Copy go.mod and go.sum files (if you have any)
COPY go.mod go.sum ./

# (Standard) Download dependencies
RUN go mod download

# Copy files
COPY ./src/lb/lb.go .

# Build the Go app
RUN go build -o lb .

# Expose port 9090
EXPOSE 9090

# Command to run the executable
CMD ["./lb"]</code></code></pre><h3><strong>Docker Compose Setup</strong></h3><p>We'll create a <code>docker-compose.yml</code> file to orchestrate the load balancer and the three backend services.</p><pre><code><code>version: '3.8'

services:
    lb:
        build:
            context: .
            dockerfile: lb.Dockerfile
        ports:
            - '9090:9090'
        depends_on:
            - server1
            - server2
            - server3

    server1:
        build:
            context: .
            dockerfile: server.Dockerfile
        container_name: server1
        expose:
            - '8888'
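        # Sketch, not part of the original setup: "expose" only documents the
        # port. server1 is reachable from other services on this network at
        # http://server1:8888, but not from the host. To reach it from the
        # host while debugging, a mapping such as the following could be
        # added (host port 8881 is an arbitrary, hypothetical choice):
        # ports:
        #     - '8881:8888'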

    server2:
        build:
            context: .
            dockerfile: server.Dockerfile
        container_name: server2
        expose:
            - '8888'

    server3:
        build:
            context: .
            dockerfile: server.Dockerfile
        container_name: server3
        expose:
            - '8888'

networks:
    default:</code></code></pre><h3>Directory Structure</h3><p>Below is my directory structure. <em>I have multiple sample projects, so everything is under a &#8220;src&#8221; folder</em>. The &#8220;&#8230;/&#8221; represents other code pieces and &#8220;goose&#8221; is the module directory where the &#8220;server.go&#8221; file contains the code for our server.</p><pre><code>.
&#9500;&#9472;&#9472; src/
&#9474;   &#9500;&#9472;&#9472; ...
&#9474;   &#9500;&#9472;&#9472; goose/
&#9474;   &#9474;   &#9492;&#9472;&#9472; server.go
&#9474;   &#9492;&#9472;&#9472; lb/
&#9474;       &#9492;&#9472;&#9472; lb.go
&#9500;&#9472;&#9472; go.mod
&#9500;&#9472;&#9472; go.sum
&#9500;&#9472;&#9472; server.Dockerfile
&#9500;&#9472;&#9472; lb.Dockerfile
&#9492;&#9472;&#9472; docker-compose.yml</code></pre><h3>Build and Run the Services</h3><pre><code>docker-compose up --build</code></pre><p>Open your browser or use <code>curl</code> to access the load balancer at <code>localhost:9090</code>.</p><pre><code>curl http://localhost:9090</code></pre><p>Repeated requests should return responses that rotate across the three backend services, like:</p><pre><code>Hello, World from 172.30.0.4!
Hello, World from 172.30.0.2!
Hello, World from 172.30.0.3!</code></pre><div class="pullquote"><p>I hope this post gives clarity on Docker Compose networking, container isolation, Go's HTTP server, and Go's reverse proxy.</p></div>]]></content:encoded></item><item><title><![CDATA[Real Time Analytics and MongoDB - The need for ScyllaDB]]></title><description><![CDATA[In previous posts, we discussed the challenges and solutions to design a real-time analytical system with MongoDB.]]></description><link>https://anujyadav.substack.com/p/real-time-analytics-and-mongodb-the-59a</link><guid isPermaLink="false">https://anujyadav.substack.com/p/real-time-analytics-and-mongodb-the-59a</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Sun, 29 Sep 2024 06:09:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!2UWv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In previous posts, we discussed 
the challenges and solutions to design a real-time analytical system with MongoDB. </p><ul><li><p><a href="https://anujyadav.substack.com/p/real-time-analytics-and-mongodb">Real-Time Analytics and MongoDB</a></p></li><li><p><a href="https://anujyadav.substack.com/p/real-time-analytics-and-mongodb-the">Real-Time Analytics and MongoDB - The data processing</a></p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2UWv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2UWv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp 424w, https://substackcdn.com/image/fetch/$s_!2UWv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp 848w, https://substackcdn.com/image/fetch/$s_!2UWv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp 1272w, https://substackcdn.com/image/fetch/$s_!2UWv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2UWv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp" width="1095" height="728" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1095,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46050,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2UWv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp 424w, https://substackcdn.com/image/fetch/$s_!2UWv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp 848w, https://substackcdn.com/image/fetch/$s_!2UWv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp 1272w, https://substackcdn.com/image/fetch/$s_!2UWv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faa524f7d-66f9-4595-b016-df8705a0f77a_1095x728.webp 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Questions</h2><blockquote><p>Is ScyllaDB required for a mid-size system? The question arises because Kafka consumers using the fanout consumer proxy pattern can consume data fast enough on their own, which keeps costs low.</p><p>How can large-scale systems support near-real-time analytics?</p></blockquote><p></p>
<h2>Exploration</h2><h3>&#128204; <strong>Why ScyllaDB Might Not Be Necessary for Mid-Size Systems</strong></h3><p>For mid-size systems, introducing ScyllaDB might be overkill. Kafka itself, with its <strong>fanout consumer proxy pattern</strong>, can be enough to handle a considerable amount of data processing. Here's why:</p><ol><li><p><strong>Kafka&#8217;s Strength in Mid-Scale Systems</strong>:</p><ul><li><p>Kafka&#8217;s ability to <strong>fan out</strong> consumers allows you to scale <strong>horizontally</strong> with multiple processing applications reading from the same Kafka topic.</p></li><li><p>Each consumer can be responsible for specific transformations or pushing data into downstream databases, effectively handling <strong>real-time streaming</strong> at a mid-scale level without introducing an extra data storage layer like ScyllaDB.</p></li></ul></li><li><p><strong>Processing Directly from Kafka</strong>:</p><ul><li><p>With <strong>Kafka Streams</strong> or <strong>Flink</strong>, you can perform most of the data processing directly from Kafka. 
This eliminates the need for another storage layer if your data processing requirements are simple enough and you don&#8217;t have to deal with <strong>extreme write throughput</strong>.</p></li><li><p>Since Kafka retains data for a configured time period (via topic retention policies), it can even handle some level of <strong>short-term persistence</strong>, making Scylla unnecessary unless you need high-write throughput over long periods.</p></li></ul></li></ol><p></p><h3>&#128204; <strong>For Large-Scale Systems: The Kafka &#8594; Scylla &#8594; Kafka Chain</strong></h3><p>For <strong>large-scale systems</strong> where <strong>near-real-time processing</strong> is required, you would likely need a <strong>CDC</strong> (Change Data Capture) mechanism to keep data <strong>continuously flowing</strong> between ScyllaDB and Kafka. Here's a more detailed view of how this could work:</p><ol><li><p><strong>Kafka Ingestion</strong>:<br>Kafka continues to serve as the <strong>data ingestion</strong> layer, capturing data from various upstream sources. 
This data could be <strong>raw events</strong>, logs, or user actions.</p></li><li><p><strong>Initial Processing and Writing to ScyllaDB</strong>:<br>Once the data is ingested, it can be transformed or enriched via Kafka Streams or a similar real-time processing framework.</p><ul><li><p><strong>Processed or partially processed</strong> data is then written to <strong>ScyllaDB</strong>, which acts as the primary <strong>persistent storage</strong> for the raw data.</p></li></ul></li><li><p><strong>Need for Continuous Data Flow (Near-Real-Time)</strong>:<br>In large-scale systems, <strong>periodic batch processing</strong> might not be enough, and you may need a more continuous data flow.</p><ul><li><p>For <strong>low-latency data processing</strong>, you would need a <strong>CDC</strong> mechanism on ScyllaDB.</p></li><li><p>ScyllaDB doesn&#8217;t emit events like Kafka, but with <strong>CDC</strong>, you can track changes and push them back into Kafka for further real-time processing.</p></li></ul></li><li><p><strong>Kafka &#8594; Scylla &#8594; Kafka Flow</strong>:<br>The <strong>Kafka &#8594; Scylla &#8594; Kafka chain</strong> could work as follows:</p><ul><li><p>Data is first ingested and minimally processed in Kafka.</p></li><li><p>The processed data is stored in <strong>ScyllaDB</strong> for persistence and scalability.</p></li><li><p>A <strong>CDC</strong> mechanism captures any updates, changes, or new data in ScyllaDB.</p></li><li><p>Those CDC events are sent back to Kafka for <strong>real-time processing</strong>.</p></li><li><p>Kafka Streams or Flink can pick up these events and perform <strong>near-real-time processing</strong>, such as joining with other streams, aggregating, or further transforming the data.</p></li></ul></li><li><p><strong>Final Storage in a Read-Optimized Layer</strong>:<br>After the real-time processing, the final <strong>transformed data</strong> is written to the <strong>read layer</strong> (such as MongoDB, Elasticsearch, or 
ClickHouse) where it&#8217;s optimized for queries, analytics, and reporting.</p></li></ol><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Ku8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Ku8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png 424w, https://substackcdn.com/image/fetch/$s_!0Ku8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png 848w, https://substackcdn.com/image/fetch/$s_!0Ku8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png 1272w, https://substackcdn.com/image/fetch/$s_!0Ku8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Ku8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png" width="956" height="690" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/79e2230d-d248-467f-b30d-4f380da59cea_956x690.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:690,&quot;width&quot;:956,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:249594,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0Ku8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png 424w, https://substackcdn.com/image/fetch/$s_!0Ku8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png 848w, https://substackcdn.com/image/fetch/$s_!0Ku8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png 1272w, https://substackcdn.com/image/fetch/$s_!0Ku8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F79e2230d-d248-467f-b30d-4f380da59cea_956x690.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Batch and Real-Time Processing</figcaption></figure></div><p></p><h3>&#128204; <strong>When Would You Need This Setup?</strong></h3><p>You would need this <strong>Kafka &#8594; Scylla &#8594; Kafka chain</strong> in cases where:</p><ul><li><p><strong>High write throughput</strong> needs to be sustained, but you also require <strong>real-time or near-real-time analytics</strong>.</p></li><li><p>You want to separate <strong>raw data ingestion</strong> (Kafka and ScyllaDB) from <strong>complex processing logic</strong> (Kafka Streams, Flink) and <strong>query-optimized storage</strong> (MongoDB, Elasticsearch, etc.).</p></li><li><p><strong>The latency</strong> between raw data ingestion and queryable data needs to be minimized, and batch processing isn&#8217;t sufficient.</p></li></ul><p></p><h3>&#128204; <strong>Summary:</strong></h3><ul><li><p><strong>For mid-size systems</strong>, Kafka and its fanout consumers can be sufficient without ScyllaDB, as it can 
handle streaming and simple transformations efficiently.</p></li><li><p><strong>For large-scale systems</strong>, ScyllaDB is useful for handling <strong>write-heavy workloads</strong>, but if you need <strong>continuous processing</strong>, the <strong>CDC mechanism</strong> is essential to avoid periodic batch jobs.</p></li><li><p>The <strong>Kafka &#8594; Scylla &#8594; Kafka chain</strong> can provide real-time data flow for <strong>low-latency processing</strong>, with Kafka capturing ingestion events, Scylla storing raw data, and Kafka being used again for CDC-based continuous data processing.</p></li></ul><p></p><div class="pullquote"><p>Found this valuable? Show some love by liking and sharing it! Your support helps us reach more people who could benefit. &#10084;&#65039;&#128226;</p></div><p></p>
]]></content:encoded></item><item><title><![CDATA[Real Time Analytics and MongoDB - The data processing]]></title><description><![CDATA[The last post discussed the high-level architecture of an analytical system that used MongoDB as a read layer.]]></description><link>https://anujyadav.substack.com/p/real-time-analytics-and-mongodb-the</link><guid isPermaLink="false">https://anujyadav.substack.com/p/real-time-analytics-and-mongodb-the</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Sun, 29 Sep 2024 05:26:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!yJRy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://anujyadav.substack.com/p/real-time-analytics-and-mongodb">The last post</a> discussed the high-level architecture of an analytical system that used MongoDB as a read layer. 
The hybrid architecture that we proposed consists of Kafka, ScyllaDB, a processing layer (Flink, Spark, a custom microservice, etc.), and MongoDB.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yJRy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yJRy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp 424w, https://substackcdn.com/image/fetch/$s_!yJRy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp 848w, https://substackcdn.com/image/fetch/$s_!yJRy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp 1272w, https://substackcdn.com/image/fetch/$s_!yJRy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yJRy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp" width="1095" height="728" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1095,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46050,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yJRy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp 424w, https://substackcdn.com/image/fetch/$s_!yJRy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp 848w, https://substackcdn.com/image/fetch/$s_!yJRy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp 1272w, https://substackcdn.com/image/fetch/$s_!yJRy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb799db9e-7d71-4e5e-b52c-29e2331931ae_1095x728.webp 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Question</h2><blockquote><p>How and why is the processing layer reading data from ScyllaDB? Kafka Streams can transform or process data, or the processing layer can consume from Kafka directly. Also, ScyllaDB does not emit events like Kafka for processors to consume. So how does all this work?</p></blockquote><p></p>
<h2>Exploration</h2><p>Let&#8217;s break down how the <strong>processing layer</strong> works in this architecture and clarify the roles of Kafka and ScyllaDB in the system.</p><h3>&#128204; <strong>Kafka&#8217;s Role in Data Transformation</strong></h3><p><strong>Kafka Streams</strong> (or similar processing frameworks) can transform data as it flows through Kafka. Kafka Streams allows for <strong>real-time data transformation</strong>, aggregation, and even windowed computations before the data is written to downstream storage (like ScyllaDB). Kafka typically excels at:</p><ol><li><p><strong>Ingestion</strong>: Capturing and buffering <strong>real-time data</strong> from producers.</p></li><li><p><strong>Stream Processing</strong>: Kafka Streams or other processing frameworks (like <strong>Apache Flink</strong> and <strong>Spark Streaming</strong>) can transform the data before pushing it to the next layer.</p></li><li><p><strong>Fan-Out</strong>: Kafka&#8217;s design allows multiple consumer groups (or processing layers) to subscribe to the same topic, each receiving the full stream, so the data can be processed independently and in parallel by different applications.</p></li></ol><p><strong>But</strong> Kafka is usually not used to <strong>store data long-term</strong> (topic retention is typically bounded), and it isn&#8217;t optimized for <strong>querying or analytics</strong>. 
That&#8217;s where <strong>ScyllaDB</strong> and <strong>MongoDB</strong> come in.</p><p></p><h3>&#128204; <strong>ScyllaDB&#8217;s Role in the Architecture</strong></h3><p>ScyllaDB isn&#8217;t designed for <strong>event-driven processing</strong> like Kafka, so it doesn&#8217;t emit events to processors. Instead, ScyllaDB is used for <strong>high-throughput data storage</strong> of raw, unprocessed data (or minimally processed data). Here&#8217;s how that works in the context of the architecture:</p><ol><li><p><strong>Data Written from Kafka</strong>:<br>Kafka Streams (or similar) <strong>pre-processes the raw data</strong> to some extent and then <strong>writes</strong> the partially transformed data to ScyllaDB. ScyllaDB acts as a <strong>time-series store</strong> or <strong>persistent store</strong> for fast writes, enabling storage of huge amounts of incoming data.</p></li><li><p><strong>Data Querying for Further Processing</strong>:<br>The <strong>processing layer</strong> (e.g., <strong>Apache Spark</strong>, <strong>Flink</strong>, or a custom service) can read the <strong>raw data</strong> stored in ScyllaDB and perform additional processing, cleaning, enrichment, or aggregation. This step is usually needed if more complex business logic, transformations, or computations are required before sending data to MongoDB.</p><p>&#128313; <strong>Querying from ScyllaDB</strong>:<br>The processing layer can query ScyllaDB directly based on <strong>primary keys</strong>, <strong>clustering keys</strong>, or time-series filters. 
Because ScyllaDB distributes data across nodes by partition key, queries that target specific partitions can retrieve large datasets efficiently.</p><p>&#128313; <strong>Batch or Near Real-Time Processing</strong>:<br>Although ScyllaDB doesn&#8217;t push events to consumers the way Kafka does, the processing layer can poll or query it periodically in <strong>batch mode</strong> (e.g., every few seconds or minutes) or, with <strong>change data capture (CDC)</strong> enabled on the table, read the change log to detect new data in near real time.</p></li><li><p><strong>Further Processing in the Layer</strong>:<br>Once the raw data is fetched from ScyllaDB, the processing layer can:</p><ul><li><p><strong>Aggregate</strong>, <strong>join</strong>, or <strong>clean the data</strong> (if needed).</p></li><li><p>Perform complex <strong>business logic</strong> transformations.</p></li><li><p>Enrich the data by combining it with data from other sources.</p></li></ul></li><li><p><strong>Processed Data Sent to MongoDB</strong>:<br>After processing, the results are sent to <strong>MongoDB</strong>. MongoDB is the <strong>final storage</strong> layer where the fully processed, enriched, and query-optimized data is stored. This is where real-time querying and analytics happen.</p></li></ol><p></p><h3>&#128204; <strong>How Kafka, ScyllaDB, and Processing Layers Interact</strong>:</h3><ol><li><p><strong>Kafka Stream Processing</strong>:<br>Kafka Streams can handle <strong>some level of processing</strong> (filtering, basic transformation, and enrichment), but only to a certain point. Kafka primarily focuses on data ingestion and ensuring data flows reliably.</p></li><li><p><strong>ScyllaDB for Low-Latency Storage</strong>:<br>ScyllaDB handles the raw data that needs to be stored <strong>quickly</strong> and <strong>at scale</strong>. 
Its purpose is to allow your architecture to ingest huge amounts of data without worrying about losing data or overwhelming MongoDB, which is better suited to querying and analytics.</p></li><li><p><strong>Processing Layer</strong>:<br>The <strong>processing layer</strong> acts as the bridge between the <strong>raw write-heavy storage</strong> (ScyllaDB) and the <strong>read-optimized storage</strong> (MongoDB). It reads data from ScyllaDB, processes it (for example, aggregating hourly/daily statistics and calculating metrics), and then sends the <strong>processed data</strong> to MongoDB for querying.</p><p>In this setup, the processing layer can be implemented using:</p><ul><li><p><strong>Apache Spark</strong> or <strong>Flink</strong> for distributed data processing.</p></li><li><p><strong>Custom microservices</strong> that pull data from ScyllaDB, process it, and push it to MongoDB.</p></li></ul></li></ol><p></p><h3>&#128204; <strong>Benefits of This Architecture</strong>:</h3><ol><li><p><strong>Separation of Concerns</strong>:<br>Kafka handles ingestion and streaming, ScyllaDB handles write-heavy storage, and MongoDB handles read-heavy querying and analytics.</p></li><li><p><strong>Optimized for Scale</strong>:<br>ScyllaDB&#8217;s design allows it to scale out and handle large amounts of data with low-latency writes, while MongoDB serves real-time queries over the processed data.</p></li><li><p><strong>Flexibility in Processing</strong>:<br>The processing layer can handle complex transformations and enrichment before the data reaches MongoDB, making it highly adaptable to different business needs.</p></li><li><p><strong>Efficient Resource Usage</strong>:<br>By offloading high-throughput writes to ScyllaDB, you avoid the cost of scaling a sharded MongoDB cluster to absorb ingestion, which can be far more expensive for write-heavy workloads.</p></li></ol><p></p><div class="pullquote"><p>Found this valuable? Show some love by liking and sharing it! 
Your support helps us reach more people who could benefit. &#10084;&#65039;&#128226;</p></div>]]></content:encoded></item><item><title><![CDATA[Real-Time Analytics and MongoDB]]></title><description><![CDATA[There seems to be a popular belief among engineers that Real-Time Analytics is a good use case for MongoDB.]]></description><link>https://anujyadav.substack.com/p/real-time-analytics-and-mongodb</link><guid isPermaLink="false">https://anujyadav.substack.com/p/real-time-analytics-and-mongodb</guid><dc:creator><![CDATA[Anuj Yadav]]></dc:creator><pubDate>Sun, 29 Sep 2024 04:21:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Xb7-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There seems to be a popular belief among engineers that Real-Time Analytics is a good use case for MongoDB. While there are legitimate reasons to believe so, the overall usage needs a deeper understanding.</p><h2>The Problem</h2><p>An analytical system needs to ingest data at a high rate. 
This requires the storage system to have low write latency and high throughput.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xb7-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xb7-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png 424w, https://substackcdn.com/image/fetch/$s_!Xb7-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png 848w, https://substackcdn.com/image/fetch/$s_!Xb7-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Xb7-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xb7-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png" width="963" height="642" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/12eb428c-6a38-4aac-8876-e18804be8692_963x642.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:642,&quot;width&quot;:963,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:117255,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xb7-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png 424w, https://substackcdn.com/image/fetch/$s_!Xb7-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png 848w, https://substackcdn.com/image/fetch/$s_!Xb7-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Xb7-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F12eb428c-6a38-4aac-8876-e18804be8692_963x642.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">High-Level Data Ingestion Visualization</figcaption></figure></div><p>MongoDB is not always the best choice for <strong>write-heavy systems</strong> compared to other NoSQL databases like <strong>Cassandra</strong> or <strong>ScyllaDB</strong>. 
Here are some challenges:</p><p>&#128204; <strong>Challenges of MongoDB in Write-Heavy Systems</strong></p><ol><li><p><strong>Write Performance</strong>: MongoDB&#8217;s default write path is <strong>not as optimized for high-throughput writes</strong> as that of some other NoSQL databases. Databases like <strong>Cassandra</strong> and <strong>ScyllaDB</strong> offer <strong>better performance for write-heavy workloads</strong> due to their design, which allows for low-latency writes and near-linear scalability across nodes.</p></li><li><p><strong>Clustering Costs</strong>: Managing a sharded MongoDB cluster can become costly. Each shard is typically deployed as a replica set, so both <strong>data</strong> and <strong>indexes are duplicated</strong> across the members of every shard, and each shard maintains its own indexes over the data it holds. This increases storage and resource consumption as the cluster grows.</p></li><li><p><strong>Critical Shard Key Selection</strong>: Choosing the right shard key is essential in MongoDB. 
A poorly chosen shard key can lead to <strong>hot spots</strong>, uneven distribution of data, or poor performance as some nodes may be overloaded while others remain underutilized.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wcyn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wcyn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png 424w, https://substackcdn.com/image/fetch/$s_!Wcyn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png 848w, https://substackcdn.com/image/fetch/$s_!Wcyn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png 1272w, https://substackcdn.com/image/fetch/$s_!Wcyn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Wcyn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png" width="762" height="603" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:603,&quot;width&quot;:762,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:130436,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wcyn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png 424w, https://substackcdn.com/image/fetch/$s_!Wcyn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png 848w, https://substackcdn.com/image/fetch/$s_!Wcyn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png 1272w, https://substackcdn.com/image/fetch/$s_!Wcyn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3333128c-821b-466c-bc94-f0e4c64eb4a0_762x603.png 1456w" sizes="100vw"></picture></div></a></figure></div>
<p></p><p>&#128204; <strong>Could MongoDB be a Good Fit for Read-Heavy Analytics?</strong></p><p>Yes, using <strong>MongoDB for storing processed data</strong> can work well in a <strong>read-heavy real-time analytics system</strong>, but pairing it with a <strong>write-optimized system</strong> can balance the workload.</p><p>&#128313; <strong>Kafka or Scylla for Write-Heavy Systems</strong>:<br>Kafka or Scylla could handle <strong>high-throughput writes</strong> and act as a buffer, ingesting data in real-time. <strong>Kafka</strong> would be useful for stream processing, and <strong>ScyllaDB</strong> could store low-latency, write-heavy data due to its high performance in write operations.</p><p>&#128313; <strong>MongoDB for Processed Data Storage</strong>:<br>Once the data is processed (aggregated, cleaned, transformed), you can store it in MongoDB for <strong>real-time querying and analysis</strong>. 
MongoDB excels in <strong>complex queries</strong> and can efficiently handle real-time analytics workloads when the focus shifts from writing to reading large volumes of data.</p><h2>The Proposal</h2><p>&#128073; Let&#8217;s dive deeper into a <strong>hybrid architecture</strong> combining <strong>Kafka, ScyllaDB, and MongoDB</strong> for handling high-throughput, low-latency writes while optimizing for real-time analytics.</p><h3>&#128204; <strong>Proposed Architecture Overview</strong></h3><p>The architecture focuses on <strong>decoupling the write-intensive</strong> and <strong>read-intensive</strong> components to optimize both data ingestion and real-time analytics.</p><h3><strong>1. Data Ingestion Layer with Kafka</strong></h3><p>At the heart of this architecture is <strong>Apache Kafka</strong> or a similar <strong>message queue system</strong> that can act as a real-time data ingestion layer. Kafka ensures that incoming data is reliably captured and distributed to downstream components.</p><p>&#128313; <strong>Kafka&#8217;s Role</strong>: Kafka can buffer and stream real-time data from various sources (e.g., user activity, sensor data, logs) while allowing multiple consumers to process this data in parallel. 
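</p><p>As a rough illustration of how keyed partitioning lets several consumers share the work, here is a minimal pure-Python sketch. It does not talk to a real broker. The names <code>partition_for</code> and <code>assign_partitions</code> are hypothetical, <code>zlib.crc32</code> is only a stand-in for the hash Kafka&#8217;s own partitioner uses, and the round-robin assignment is a simplification of how a consumer group divides partitions.</p><pre><code>import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumption: a topic with four partitions


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a message key to a partition; the same key always maps to the same one."""
    # crc32 is a stand-in hash for illustration, not the hash Kafka itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin."""
    assignment = defaultdict(list)
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return dict(assignment)


events = [
    ("sensor-1", {"temperature": 21.5}),
    ("sensor-2", {"humidity": 40}),
    ("sensor-1", {"temperature": 21.7}),
]

# Append each event to the log of the partition chosen by its key.
partitions = defaultdict(list)
for key, value in events:
    partitions[partition_for(key)].append((key, value))

# Each partition is read by exactly one consumer of the group.
assignment = assign_partitions(NUM_PARTITIONS, ["consumer-a", "consumer-b"])
</code></pre><p>Because both <code>sensor-1</code> readings land in the same partition, they are read in the order they were written, while the remaining partitions can be handled by another consumer at the same time.</p><p>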
Kafka&#8217;s ability to sustain <strong>millions of writes per second</strong> on a suitably sized cluster makes it a good fit for high-throughput applications.</p><p>&#128313; <strong>Advantages</strong>:</p><ul><li><p><strong>Low Latency</strong>: Kafka delivers messages in near real time with low per-message overhead.</p></li><li><p><strong>Fault Tolerance</strong>: Kafka persists messages to disk and replicates them across brokers, so data is not lost when downstream systems are under load; a slow consumer can catch up later, within the configured retention period.</p></li><li><p><strong>Horizontal Scalability</strong>: Topics are split into partitions spread across brokers, so adding brokers and partitions lets throughput grow with data volume.</p></li></ul><p>&#128313; <strong>Example Use Case</strong>:<br>For a <strong>real-time IoT system</strong>, Kafka can collect readings streamed by thousands of sensors, such as temperature, humidity, and device status.</p><h3><strong>2. Write-Heavy Data Store with ScyllaDB</strong></h3><p>Once Kafka ingests data, the <strong>raw, unprocessed data</strong> needs to be stored in a <strong>write-optimized NoSQL system</strong>. This is where <strong>ScyllaDB</strong> (or <strong>Cassandra</strong>) comes in. 
ScyllaDB is a NoSQL database designed for <strong>low-latency, high-throughput writes</strong>, which makes it well suited to <strong>write-heavy workloads</strong>.</p><p>&#128313; <strong>Why ScyllaDB (or Cassandra)</strong>:</p><ul><li><p><strong>Low-Latency Writes</strong>: ScyllaDB&#8217;s design supports <strong>high throughput</strong> with low-latency writes, a good match for data arriving rapidly from Kafka.</p></li><li><p><strong>Linear Scalability</strong>: ScyllaDB is known for scaling close to linearly, meaning that adding more nodes improves read and write capacity roughly in proportion.</p></li><li><p><strong>Automatic Data Replication</strong>: ScyllaDB automatically replicates data across nodes for fault tolerance, and the consistency level can be tuned per query.</p></li></ul><p>&#128313; <strong>Architecture</strong>:</p><ul><li><p><strong>Kafka Consumers</strong>: A set of consumers reads messages from Kafka and writes them to ScyllaDB in real time.</p></li><li><p><strong>Time-Series Data</strong>: ScyllaDB handles time-series data well, where each incoming data point (e.g., from IoT sensors) is written as a new record.</p></li></ul><p>&#128313; <strong>Example Use Case</strong>:</p><ul><li><p>In a <strong>real-time analytics system</strong> for a ride-hailing service, ScyllaDB can store raw trip data (e.g., location updates, timestamps) from millions of drivers and riders. This workload is write-intensive, but the raw data does not need to be queried in real time yet.</p><p></p></li></ul><h3><strong>3. Real-Time Analytics with MongoDB</strong></h3><p>Once data is stored in ScyllaDB, it can be <strong>processed</strong> (aggregated, cleaned, transformed) and then passed to <strong>MongoDB</strong>, which acts as the <strong>read-optimized store</strong> for real-time analytics. 
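</p><p>To make the processing step concrete, here is a minimal sketch that turns raw location updates for one trip into a single processed document. It assumes the ride-hailing example used in this post. The function names, field names, and sample values are hypothetical, and the database reads and writes are left out so the snippet runs on its own.</p><pre><code>import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def summarize_trip(trip_id, updates):
    """Turn raw location updates into one processed document for the read store."""
    # Clean: drop rows without coordinates, then order by timestamp.
    points = sorted(
        (u for u in updates if u.get("lat") is not None and u.get("lon") is not None),
        key=lambda u: u["ts"],
    )
    if not points:
        return None
    # Aggregate: sum the distance between consecutive points.
    distance = sum(
        haversine_km(a["lat"], a["lon"], b["lat"], b["lon"])
        for a, b in zip(points, points[1:])
    )
    # Transform: shape the result as a document ready to store.
    return {
        "trip_id": trip_id,
        "started_at": points[0]["ts"],
        "ended_at": points[-1]["ts"],
        "distance_km": round(distance, 2),
        "num_points": len(points),
    }


# Hypothetical raw rows, as they might be read back from the write-optimized store.
raw_updates = [
    {"ts": datetime(2024, 1, 1, 9, 5), "lat": 12.98, "lon": 77.60},
    {"ts": datetime(2024, 1, 1, 9, 0), "lat": 12.97, "lon": 77.59},
    {"ts": datetime(2024, 1, 1, 9, 2), "lat": None, "lon": None},
]
trip_doc = summarize_trip("trip-42", raw_updates)
</code></pre><p>With PyMongo, a document like <code>trip_doc</code> could then be stored using <code>insert_one</code> on a collection of your choosing.</p><p>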
MongoDB&#8217;s strength lies in complex, flexible querying and in serving real-time insights.</p><p>&#128313; <strong>Why MongoDB</strong>:</p><ul><li><p><strong>Schema Flexibility</strong>: MongoDB lets developers store processed data in a flexible schema, which suits evolving business needs.</p></li><li><p><strong>Rich Querying</strong>: MongoDB&#8217;s aggregation framework supports complex filtering, grouping, and transformation, which is what real-time insights are built on.</p></li><li><p><strong>Scaling Reads</strong>: MongoDB can scale reads horizontally through sharding and replica-set secondaries, which is critical for real-time analytics.</p></li><li><p><strong>Indexes</strong>: MongoDB&#8217;s indexing system supports fast querying. However, <strong>every shard maintains its own indexes</strong> for the data it holds, so each extra index adds resource usage across the cluster; careful <strong>index design</strong> and <strong>shard key</strong> choice are essential.</p></li></ul><p>&#128313; <strong>Sharding and Indexing</strong>:</p><ul><li><p>MongoDB uses <strong>sharding</strong> to distribute large datasets across multiple nodes. Each shard builds and maintains the collection&#8217;s indexes for its own subset of the data, so index storage and maintenance overhead is paid on every shard.</p></li><li><p>The <strong>shard key</strong> is critical: it determines how data is distributed across shards and whether a query can be routed to a single shard or has to be sent to all of them.</p></li></ul><p>&#128313; <strong>Example Use Case</strong>:</p><ul><li><p>For the same ride-hailing service, MongoDB can store processed trip data such as total distance, fare, driver ratings, and customer feedback. 
The <strong>aggregation</strong> of this data could provide <strong>real-time insights</strong> such as average trip length, busiest times of day, or heatmaps of ride demand.</p></li></ul><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FPFP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FPFP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png 424w, https://substackcdn.com/image/fetch/$s_!FPFP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png 848w, https://substackcdn.com/image/fetch/$s_!FPFP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png 1272w, https://substackcdn.com/image/fetch/$s_!FPFP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FPFP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png" width="1095" height="728" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1095,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:321149,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FPFP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png 424w, https://substackcdn.com/image/fetch/$s_!FPFP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png 848w, https://substackcdn.com/image/fetch/$s_!FPFP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png 1272w, https://substackcdn.com/image/fetch/$s_!FPFP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e14d793-5c9c-4041-bcd0-ce019b3674cd_1095x728.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p></p><h3>&#128204; <strong>The Role of Each Component</strong></h3><p><strong>1. Kafka (Data Ingestion)</strong>:<br>Kafka ingests real-time streaming data (e.g., logs, sensor data, app activity) and acts as the intermediary between the write-heavy store and processing layer. Kafka scales horizontally to handle high throughput and ensures no data is lost under heavy load.</p><p><strong>2. ScyllaDB (Write-Optimized Store)</strong>:<br>ScyllaDB is the system of choice for handling <strong>low-latency, write-heavy workloads</strong>. It acts as a scalable time-series store for unprocessed data that needs to be written at high speed. While MongoDB would struggle with this volume of writes, ScyllaDB excels here.</p><p><strong>3. MongoDB (Read-Optimized Analytics)</strong>:<br>Once the data is processed, MongoDB provides a <strong>read-optimized storage layer</strong> that supports flexible schema changes and complex queries, ideal for real-time analytics. 
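</p><p>As a sketch of the kind of query this layer serves, the snippet below defines an aggregation pipeline that counts trips per hour of day and averages their distance, followed by a plain-Python equivalent so the idea can be run without a database. The field names <code>started_at</code> and <code>distance_km</code>, the function <code>busiest_hours</code>, and the sample documents are all assumptions made for illustration.</p><pre><code>from collections import defaultdict
from datetime import datetime

# Aggregation pipeline in the list-of-stages form that PyMongo accepts.
pipeline = [
    {"$group": {
        "_id": {"$hour": "$started_at"},
        "trips": {"$sum": 1},
        "avg_distance_km": {"$avg": "$distance_km"},
    }},
    {"$sort": {"trips": -1}},
]


def busiest_hours(trips):
    """Plain-Python equivalent of the pipeline above, for in-memory documents."""
    buckets = defaultdict(list)
    for trip in trips:
        buckets[trip["started_at"].hour].append(trip["distance_km"])
    rows = [
        {"_id": hour, "trips": len(km), "avg_distance_km": sum(km) / len(km)}
        for hour, km in buckets.items()
    ]
    # Busiest hour first, mirroring the $sort stage.
    return sorted(rows, key=lambda row: row["trips"], reverse=True)


# Hypothetical processed trip documents.
processed_trips = [
    {"started_at": datetime(2024, 1, 1, 9, 0), "distance_km": 4.0},
    {"started_at": datetime(2024, 1, 1, 9, 30), "distance_km": 6.0},
    {"started_at": datetime(2024, 1, 1, 18, 15), "distance_km": 3.0},
]
report = busiest_hours(processed_trips)
</code></pre><p>Against a live deployment, the same report would come from passing <code>pipeline</code> to a collection&#8217;s <code>aggregate</code> method in PyMongo; the in-memory version is only there to show the shape of the result.</p><p>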
MongoDB scales horizontally to handle large datasets while keeping queries fast.</p><h3>&#128204; <strong>Architecture Flow Example</strong>:</h3><ol><li><p><strong>Data Ingestion</strong>: Real-time data from IoT devices, apps, or logs flows into Kafka.</p></li><li><p><strong>Low-Latency Writes</strong>: Kafka consumers write the raw data into ScyllaDB, where it is stored and partitioned by time, location, or other business-specific criteria.</p></li><li><p><strong>Processing Layer</strong>: A <strong>data processing layer</strong> reads data from ScyllaDB, processes it (e.g., aggregates, cleans, transforms), and passes the results to MongoDB.</p></li><li><p><strong>Real-Time Analytics</strong>: MongoDB stores the processed data and provides real-time insights via its rich querying and aggregation framework. Dashboards, analytics platforms, and apps pull data from MongoDB to display insights.</p><p></p></li></ol><h3>&#128204; <strong>Advantages of This Hybrid Setup</strong>:</h3><p>&#128313; <strong>Decoupling Write and Read Workloads</strong>:<br>By splitting <strong>write-heavy</strong> and <strong>read-heavy</strong> operations between ScyllaDB and MongoDB, the architecture lets each database handle the workload it&#8217;s optimized for.</p><p>&#128313; <strong>Kafka&#8217;s Buffering Capabilities</strong>:<br>Kafka acts as a buffer, making the system more <strong>resilient to spikes</strong> in traffic, so that the databases can ingest data at their own pace.</p><p>&#128313; <strong>Scalability</strong>:<br>Both <strong>ScyllaDB</strong> and <strong>MongoDB</strong> are designed for horizontal scalability, making it easier to scale the system as data volumes grow.</p><p>&#128313; <strong>Optimized Costs</strong>:<br>MongoDB clustering is costly, especially when handling high write volumes. 
By offloading writes to ScyllaDB, you reduce MongoDB&#8217;s storage and processing burden, which can significantly cut down on operational costs.</p><h3>&#128204; <strong>Challenges to Consider</strong>:</h3><ol><li><p><strong>Data Synchronization</strong>:<br>You&#8217;ll need an efficient processing pipeline between <strong>ScyllaDB</strong> and <strong>MongoDB</strong> to ensure that data is processed and synced in near real-time.</p></li><li><p><strong>Shard Key Design</strong>:<br>Choosing the right <strong>shard key</strong> in MongoDB is critical. Poor shard key selection can lead to uneven data distribution or &#8220;hotspots,&#8221; which can degrade performance.</p></li><li><p><strong>Operational Complexity</strong>:<br>Managing <strong>multiple database systems</strong> increases the operational overhead. Ensuring seamless integration and monitoring between Kafka, ScyllaDB, and MongoDB will require careful planning and tooling.</p><p></p></li></ol><h3>&#128204; <strong>Conclusion</strong></h3><p>This hybrid architecture provides a balance between <strong>low-latency writes</strong> (with ScyllaDB) and <strong>real-time analytics</strong> (with MongoDB), powered by <strong>Kafka</strong> for reliable data ingestion. This approach maximizes performance while managing the cost and complexity associated with MongoDB clustering.</p><p></p><blockquote><p>&#128073; For mid-size systems introducing ScyllaDB might be overkill. Kafka itself, with its fanout consumer proxy pattern, can be enough to handle a considerable amount of data processing.</p></blockquote>]]></content:encoded></item></channel></rss>