🧠 Just tried out a really cool Python library — toon_format — and it's a hidden gem for anyone working with LLMs or large data payloads. It's a compact, human-readable serialization format that reduces context size by 30–60% vs JSON while staying easy to read and use.

What makes it awesome:
• YAML-like indentation
• CSV-style tabular arrays
• Minimal syntax with array validation
• Python 3.8+ and battle-tested
• Fully compatible with the official TOON spec

⚙️ Install it: pip install toon_format (or uv add toon_format)

Quick example 👇

from toon_format import encode, decode

encode({"name": "Alice", "age": 30})
# name: Alice
# age: 30

encode([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
# [2]{id,name}:
#   1,Alice
#   2,Bob

We have been using it to trim LLM context payloads — efficient and still human-friendly. 🚀 If you deal with JSON bloat or token limits, give toon_format a try! I've shared the repository link in the first comment.

#Python #OpenSource #LLM #Serialization #AI #Developers #MachineLearning #GenAI
Discover toon_format: A YAML-like serialization library for LLMs
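A quick round-trip sketch to go with the post above; the post imports both encode and decode, so I'm assuming decode simply reverses encode for JSON-like data:

from toon_format import encode, decode

data = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
toon_text = encode(data)      # compact TOON string, as shown in the post
restored = decode(toon_text)  # assumption: decode parses TOON back into Python objects
assert restored == data       # assumption: the round trip is lossless for JSON-like data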
🤖 𝐏𝐘𝐓𝐇𝐎𝐍 𝐈𝐍𝐒𝐈𝐆𝐇𝐓 𝐅𝐎𝐑 𝐀𝐈 𝐀𝐆𝐄𝐍𝐓𝐒 & 𝐓𝐄𝐗𝐓-𝐓𝐎-𝐒𝐐𝐋 𝐁𝐔𝐈𝐋𝐃𝐄𝐑𝐒

While working on a 𝐑𝐀𝐆-𝐛𝐚𝐬𝐞𝐝 𝐓𝐞𝐱𝐭-𝐭𝐨-𝐒𝐐𝐋 𝐠𝐞𝐧𝐞𝐫𝐚𝐭𝐨𝐫, I ran into a subtle but powerful distinction in Python:

🔹 list() → a 𝐛𝐮𝐢𝐥𝐭-𝐢𝐧 𝐜𝐨𝐧𝐬𝐭𝐫𝐮𝐜𝐭𝐨𝐫 that actually 𝘤𝘳𝘦𝘢𝘵𝘦𝘴 a list at runtime.
🔹 List → a 𝐭𝐲𝐩𝐞 𝐡𝐢𝐧𝐭 from the typing module that 𝘥𝘦𝘴𝘤𝘳𝘪𝘣𝘦𝘴 what the list contains, for tools and AI frameworks.

When building 𝐀𝐈 𝐚𝐠𝐞𝐧𝐭𝐬 or 𝐋𝐚𝐧𝐠𝐂𝐡𝐚𝐢𝐧 𝐩𝐢𝐩𝐞𝐥𝐢𝐧𝐞𝐬, this difference matters.
- list() controls how your data structures behave during execution.
- List defines how your system’s components (like retrievers, LLMs, or SQL generators) communicate type expectations.

Clear typing helps your agents validate inputs, prevent errors, and maintain consistency across multiple asynchronous nodes — especially in complex 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥-𝐚𝐮𝐠𝐦𝐞𝐧𝐭𝐞𝐝 𝐠𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐨𝐧 (𝐑𝐀𝐆) workflows.

𝘐’𝘷𝘦 𝘢𝘵𝘵𝘢𝘤𝘩𝘦𝘥 𝘮𝘺 𝘧𝘶𝘭𝘭 𝘔𝘦𝘥𝘪𝘶𝘮 𝘱𝘰𝘴𝘵 𝘣𝘦𝘭𝘰𝘸 𝘧𝘰𝘳 𝘮𝘰𝘳𝘦 𝘥𝘦𝘵𝘢𝘪𝘭𝘴.

#Python #LangChain #AI #DataEngineering #MachineLearning #TextToSQL #SoftwareDevelopment #LearningEveryDay
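A minimal sketch of the distinction, with made-up function and variable names (not from any particular framework):

from typing import List

# list() is the runtime constructor: it actually creates a list object.
retrieved_chunks = list()            # same as []
retrieved_chunks.append("SELECT * FROM sales LIMIT 5;")

# List[...] is a static type hint: it documents what the list should contain;
# nothing extra happens at runtime.
def build_sql_prompts(questions: List[str]) -> List[str]:
    """Turn natural-language questions into SQL-generation prompts."""
    return [f"Write a SQL query for: {q}" for q in questions]

prompts = build_sql_prompts(["total sales by region", "top 10 customers"])
print(prompts)

# Since Python 3.9, the built-in list can be subscripted directly,
# so `list[str]` works as a hint without importing List from typing.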
𝐏𝐲𝐭𝐡𝐨𝐧 𝐓𝐢𝐩 𝐨𝐟 𝐭𝐡𝐞 𝐃𝐚𝐲: 𝐌𝐚𝐬𝐭𝐞𝐫𝐢𝐧𝐠 𝐟𝐢𝐥𝐭𝐞𝐫(), 𝐦𝐚𝐩(), 𝐚𝐧𝐝 𝐬𝐨𝐫𝐭𝐞𝐝()

When working with Python, these three built-in functions can make your data processing cleaner, faster, and more readable. Let’s break them down 👇

↘️ map() - Transform Data
Applies a function to every element in an iterable.
Example:
numbers = [1, 2, 3, 4, 5]
squares = list(map(lambda x: x**2, numbers))
print(squares)  # Output: [1, 4, 9, 16, 25]
✅ Use when you want to modify or compute new values from existing data.

↘️ filter() - Extract What You Need
Keeps elements based on a condition (a function that returns True or False).
Example:
numbers = [1, 2, 3, 4, 5]
evens = list(filter(lambda x: x % 2 == 0, numbers))
print(evens)  # Output: [2, 4]
✅ Use when you need to keep only specific elements that match a condition.

↘️ sorted() - Arrange Your Data
Sorts elements of an iterable (ascending by default). You can customize it using the key parameter.
data = [("apple", 3), ("banana", 1), ("cherry", 2)]
sorted_data = sorted(data, key=lambda x: x[1])
print(sorted_data)  # Output: [('banana', 1), ('cherry', 2), ('apple', 3)]
✅ Use when you need to organize your data in a specific order.

💡 In short:
map() → Transform
filter() → Select
sorted() → Organize

Mastering these three can make your Python code not just functional but elegant.

#Python #CodingTips #DataScience #DataEngineering #Learning
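Putting all three together in one illustrative pipeline; the order data below is invented:

orders = [("laptop", 1200), ("mouse", 25), ("monitor", 300), ("cable", 8)]

# filter → keep orders above $20, map → apply a 10% discount, sorted → cheapest first
discounted = sorted(
    map(lambda o: (o[0], round(o[1] * 0.9, 2)),
        filter(lambda o: o[1] > 20, orders)),
    key=lambda o: o[1],
)

print(discounted)  # [('mouse', 22.5), ('monitor', 270.0), ('laptop', 1080.0)]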
Writing a for-loop in Python to process a list of data? You might be adding hours to your script's runtime without even knowing it.

I see this all the time: analysts use loops for data transformations that could be done in a fraction of the time. The bottleneck isn't your computer's speed—it's how you're talking to it.

The secret to faster data processing in Python is vectorization. Instead of processing each element one-by-one in a loop, vectorized operations apply a function to an entire dataset at once, leveraging optimized, pre-compiled C code under the hood.

Let's take a common task: calculating the square of every number in a list.

The Slow Way (Loop):
import pandas as pd
data = pd.Series(range(1, 1000001))
squared_list = []
for num in data:
    squared_list.append(num ** 2)

The Fast Way (Vectorized):
import pandas as pd
data = pd.Series(range(1, 1000001))
squared = data ** 2

The vectorized approach isn't just cleaner—it's dramatically faster. For a million rows, the loop might take ~150ms, while the vectorized operation can finish in ~2ms. That's a 98.7% reduction in processing time!

This principle applies across pandas and NumPy:
Use df['column'].str.upper() instead of looping with .upper()
Use NumPy's universal functions (np.log, np.sqrt) on whole arrays
Reach for df['column'].apply(function) only when no vectorized equivalent exists; .apply still runs a Python-level loop under the hood, so it's mainly a readability win, not a true vectorized speedup

Adopting a vectorized mindset is a game-changer for efficiency.

Have you ever refactored a slow loop into a vectorized operation? What was the performance boost like? Share your story below!

#Python #DataAnalysis #Pandas #CodingTips #DataScience
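A small sketch of the other vectorized patterns mentioned in the post (the string accessor and NumPy ufuncs); the DataFrame and column names are made up:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": ["london", "paris", "tokyo"],
    "revenue": [120.0, 98.5, 143.2],
})

# Vectorized string method: no Python-level loop over rows
df["city_upper"] = df["city"].str.upper()

# NumPy universal function applied to the whole column at once
df["log_revenue"] = np.log(df["revenue"])

print(df)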
Lambda functions aren’t just for one-liners. They can make your Python data workflows cleaner and faster.

Here are 5 Python lambda tricks every data scientist should master:
1 → Writing concise one-off functions instead of full def blocks
2 → Using lambdas with map(), filter(), sort() for clean transformations
3 → Capturing variables in closures for pipeline convenience
4 → Combining lambdas with pandas and NumPy for inline operations
5 → Choosing when not to use lambdas (for readability & debugging)

Read it here: https://lnkd.in/djGG3rfW
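Not from the linked article: just a quick sketch of tricks 2–4 using invented data:

import pandas as pd

records = [{"name": "Ana", "score": 91}, {"name": "Ben", "score": 78}, {"name": "Cleo", "score": 85}]

# Trick 2: lambdas with filter()/sorted() for quick transformations
top = sorted(filter(lambda r: r["score"] >= 80, records), key=lambda r: r["score"], reverse=True)

# Trick 3: capturing a variable in a closure
def make_scaler(factor):
    return lambda x: x * factor   # `factor` is captured from the enclosing scope

double = make_scaler(2)

# Trick 4: inline lambda with pandas
df = pd.DataFrame(records)
df["curved"] = df["score"].apply(lambda s: min(100, s + 5))

print(top, double(21), df["curved"].tolist())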
Using a meta-meta-prompt to generate a self-documenting Python script

I have been playing with AI-generated code (see my previous post on this), and one thing I am keen to do is document the process: tell the AI not only to produce code but also to document how it was generated, i.e., the set of prompts used to arrive at the final code. So I needed to write a prompt that would tell the AI to do a task and also document the prompt itself. Instead of figuring this out on my own, I asked Claude to tell me what prompt I needed to give it! Since I was asking Claude how to generate a prompt for a task, my initial question was a meta-meta-prompt. Confused? Read the full details here: https://lnkd.in/e2aR86ep

A hint: using Quarto markdown makes this task much easier, since it allows a document to contain executable code.
🧠 Day 282: Exploring the Magic of Regular Expressions in Python (re)

Ever tried to validate an email, extract numbers from text, or find patterns in data? That’s where Python’s re module comes in — your go-to Swiss Army knife for pattern matching and text manipulation.

Let’s see it in action 👇

import re

text = "The year is 2025."
match = re.search(r'\d+', text)
if match:
    print(f"Found Number: {match.group()}")

Here, \d+ searches for one or more digits — and just like that, Python extracts 2025 from the text.

💡 Pro Tip: Use re for cleaning data, validating inputs (like emails or phone numbers), or extracting insights from unstructured text.

🎯 Challenge: Try writing a regex that validates whether a string is a proper email address — and test it with both valid and invalid examples.

#Python #RegularExpressions #DataCleaning #Regex
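One possible (deliberately simple) take on the challenge; real-world email validation is messier than any short regex, so treat this as a starting point rather than a spec-compliant validator:

import re

# A pragmatic pattern: something@something.tld (not a full RFC 5322 validator)
EMAIL_RE = re.compile(r'^[\w.+-]+@[\w-]+(\.[\w-]+)+$')

tests = ["alice@example.com", "bob.smith+tag@mail.co.uk", "not-an-email", "missing@tld"]
for t in tests:
    print(t, "->", bool(EMAIL_RE.match(t)))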
⚡ Handling Missing Values in Python

Here’s a simple breakdown of the different methods used in Python.

1️⃣ Identify Missing Values
df.isnull()        # Shows True/False for missing values
df.isnull().sum()  # Counts missing values per column

You can also check the percentage of missing data:
(df.isnull().sum() / len(df)) * 100

2️⃣ Remove Missing Values
If the missing values are few or not significant:
df.dropna()        # Removes rows with missing values
df.dropna(axis=1)  # Removes columns with missing values
Use this when deleting data doesn’t affect the dataset’s overall quality.

3️⃣ Fill Missing Values
When you can’t afford to drop data, fill the missing values instead.

🔹 Constant value
df['Name'] = df['Name'].fillna('Unknown')

🔹 Mean / Median / Mode (for numerical columns)
df['Age'] = df['Age'].fillna(df['Age'].mean())
df['Salary'] = df['Salary'].fillna(df['Salary'].median())

🔹 Forward or Backward Fill (for time series)
df = df.ffill()  # Forward fill
df = df.bfill()  # Backward fill
(fillna(method='ffill') still appears in older code, but recent pandas deprecates the method argument in favor of ffill()/bfill().)

4️⃣ Advanced Imputation Using Models
For large datasets or when data is missing in patterns:
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy='mean')
df[['Age', 'Salary']] = imputer.fit_transform(df[['Age', 'Salary']])
Other strategies: 'median', 'most_frequent', and 'constant'.

🔹 Best Practices
Use mean/median for numerical data.
Use mode or "Unknown" for categorical data.
Drop columns if more than 40–50% of the data is missing.
Always analyze the pattern of missingness before deciding.

#Python #DataCleaning #Pandas #DataAnalytics
🐍 Python String Methods You Must Know 💡

Strings in Python come packed with built-in methods that make text manipulation effortless. Whether you’re cleaning data, formatting output, or analyzing text — these functions are your best friends.

Here’s what they do:
🔹 capitalize() → Makes the first character uppercase.
🔹 casefold() → Converts the string to lowercase (more aggressive than lower()).
🔹 count(sub) → Counts how many times a substring appears.
🔹 find(sub) → Returns the index of the first occurrence (or -1 if not found).
🔹 index(sub) → Like find(), but raises an error if not found.
🔹 isalnum() → Checks if all characters are alphanumeric.
🔹 isalpha() → Checks if all characters are letters.
🔹 isascii() → Returns True if all characters are ASCII.
🔹 isdecimal(), isdigit(), isnumeric() → Check if characters are numeric (subtle differences!).
🔹 islower() → Checks if all characters are lowercase.
🔹 isidentifier() → Checks if the string is a valid Python identifier (e.g., a variable name).
🔹 isprintable() → Checks if all characters are printable.

💡 Pro tip: Chain these methods smartly — like text.strip().lower().replace(" ", "_") — to clean and format text in a single line.

#CodingTips #BuildInPublic #DeveloperJourney #CleanCode #StringMethods
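A tiny demo of a few of these methods in action (the sample strings are invented):

raw = "  Data Engineering 101  "

# Chain methods to clean and normalize in one pass
slug = raw.strip().lower().replace(" ", "_")
print(slug)                                          # data_engineering_101

# A few of the checks in action
print("python3".isalnum())                           # True (letters + digits)
print("Straße".casefold() == "strasse".casefold())   # True (casefold is more aggressive than lower)
print("hello world".find("world"))                   # 6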
🐍 Understanding Dictionaries in Python

A Dictionary in Python is a built-in data structure used to store data in key-value pairs. It’s mutable and indexed by keys — meaning you can easily access, update, or remove data using unique keys. (Since Python 3.7, dictionaries also preserve insertion order.)

🚀 Key Points:
🔹 Keys must be unique and immutable (like strings, numbers, or tuples).
🔹 Values can be of any data type.
🔹 Enclosed in curly braces {}.
🔹 Commonly used for storing structured data like configurations, user info, or mapping relationships.

🔍 Why use Dictionaries?
✅ Fast data lookup
✅ Easy data mapping
✅ Great for JSON-like data structures

💬 In short: A Dictionary is like a real-world dictionary — you look up a word (key) to find its meaning (value).

#Python #DataStructures #PythonProgramming #Coding #LearnPython #LinkedInLearning
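A minimal sketch of the common dictionary operations described above (the keys and values are illustrative):

# A small config-style dictionary (keys are unique and immutable; values can be anything)
user = {
    "name": "Alice",
    "age": 30,
    "roles": ["admin", "editor"],
}

print(user["name"])              # fast lookup by key -> 'Alice'
user["age"] = 31                 # update a value
user["active"] = True            # add a new key-value pair
user.pop("roles", None)          # remove a key safely
print(user.get("email", "n/a"))  # .get avoids KeyError for missing keys -> 'n/a'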
Toon Python Repository Link: https://lnkd.in/dkrUJYxW