Managing nushell scripts

Nushell is a powerful programming language because it makes very good use of Unix piping mechanics.

Take this very simple one-liner I made for printing the terminal screen to a PDF like it's the 80s:

export def makepdf [name: string] {
  to text | enscript -B -fHelvetica10 -p - | ps2pdf - | save $"($name).pdf" -f; mupdf $"($name).pdf"
}

I save this to a file called utils.nu, and to use it I can load it with one of two commands. The use command:

use utils.nu;
echo "hello world!" | utils makepdf helloworld

or the source command:

source utils.nu;
echo "hello world!" | makepdf helloworld

Both do the same thing: pipe text into a utility called enscript, convert the resulting PostScript to PDF with Ghostscript's ps2pdf, and view it via my favorite lightweight application, mupdf.
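The difference between the two is scoping: source runs the file in the current scope, so makepdf lands in the top-level namespace, while use treats the file as a module and prefixes its commands with the module name. If you want module semantics without the prefix, nushell also lets you import everything with a glob:

use utils.nu *
echo "hello world!" | makepdf helloworld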

That's a lot of dependencies to make one simple function work, but when it does, it can fly through a database of records and build an entire file tree of PDFs if needed (batch processing).

And it's very simple to write an elaborate program that takes advantage of this functional paradigm. I made this database browser that goes through my chat histories with LLMs using skim (sk, an fzf substitute):

export def sqlbrowse [] {
  open $"($env.HOME)/chatbot_conversation.db"
  | (do {|x| $x.conversation_history | sk | update chatbot_response (gum write --value $"($in.chatbot_response)") } $in)
  | get chatbot_response
}

With that in place, making a PDF of all of the conversations takes one simple shell command:

open $"($env.HOME)/chatbot_conversation.db" | get conversation_history | each {|x| $x.chatbot_response | makepdf $"($x.id)" }

And you get separate PDF files named {1 to n}.pdf in your current working directory.

I have thousands of scientific articles accumulated over the decades that I haven't had much time to digest. From an archival perspective, it doesn't make sense to keep them in PDF format if they are not read, used, or shared. Furthermore, PDFs are not very LLM-friendly, since the model spends a lot of expensive attention on the positioning of text. For text-heavy documents it's often better to convert and archive them in a cleaner text format that can be wrapped in metadata and piped to other interfaces, particularly LLM-based note-taking knowledge bases.

LLMs do a particularly good job of ingesting reading for you. The hard part is data preparation, and the first 80% of extraction is knowing the right tools. For PDF extraction, poppler-utils ships a pdftotext binary that converts PDFs to plaintext.
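As a minimal sketch of that batch conversion (the ./articles path and the -layout flag are my assumptions; adjust both for your own archive):

export def pdf2txt-all [] {
  glob ./articles/**/*.pdf | each {|pdf|
    # derive foo.txt next to foo.pdf
    let out = ($pdf | path parse | update extension txt | path join)
    pdftotext -layout $pdf $out
  }
}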

By doing so, you often cut the file size roughly in half and give the content a second life by making it greppable with the popular utilities you know and love.

The really neat thing about batch processing is that it lets you centralize your data for intelligence gathering and sharing with your other knowledge-base interfaces. Enterprises are all about this kind of thing. Since LLMs hit the scene, corporations have been well underway in migrating legacy business documents into compiled datasets. But one of the hard things to port over is the day-to-day tooling: Excel spreadsheets with a few VB macros that parse CSV tables and generate reports.

That's three different languages, technically four if you count the SQL commands under the hood that fetch the CSV file. And probably 95% of businesses still operate like this day-to-day.

By moving a lot of that scripting into nushell, you can take advantage of SQL, the standard query language, and mix it with nushell's type system in a shell for amazing utility. Below is a script that edits a chatbot response with a SQL UPDATE command, using the fuzzy finder as a selector:

export def llmdbedit [] {
  open $"($env.HOME)/chatbot_conversation.db"
  | query db "UPDATE conversation_history SET chatbot_response = :chatbot_response WHERE id = :id" -p (
      do {|x|
        $x.conversation_history
        | sk --format {get user_input} --preview {get chatbot_response}
        | update chatbot_response (gum write --value $"($in.chatbot_response)")
      } $in
      | reject user_input timestamp
    )
}

This makes database queries simple and fun.
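The same query db command handles plain reads too. A minimal sketch with a named parameter, reusing the table and column names from the script above:

open $"($env.HOME)/chatbot_conversation.db"
| query db "SELECT id, user_input FROM conversation_history WHERE id = :id" -p {id: 42}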

But how do we manage all of these new commands we build? Nushell scripts can be invoked as I mentioned, but I don't find they fit as modules for public consumption. Rather, they are personal scripts that can generate my resume or cover letter for me without the use of Python libraries. It's a system-in-a-shell concept.

Personally, I've found GitHub Gists work really well with nushell scripts. Simply write:

gh gist create utils.nu

And GitHub automatically creates the utils.nu gist for you and hands back a link. You might have to authenticate with the gh CLI first (via SSH or a GH_TOKEN), but after that you basically have a control center for querying a repository of nushell scripts. Here is a little helper function I made that lists all gists with a .nu file extension, pipes them into a fuzzy finder, and saves the selection locally as gist.nu. That lets me activate it with source gist.nu.

export def ghnu [] {
  gh gist view (
    gh gist list
    | from tsv --noheaders
    | filter {|el| ($el.column1 | path parse | get extension) == nu}
    | sk --format {get column1}
    | get column0
  )
  | save gist.nu -f
}

This takes a lot of the guesswork out of Python library management for the everyday mundane clerical work businesses use Python for. Python is a terrible language for this kind of business scripting because its virtual-environment management and versioning make every update a dance with the devil.

While nushell is technically a forever-alpha project with breaking changes (especially around strings), it's surprisingly robust, and it outperforms Python for this work simply by being a batteries-included typed shell.

What I mean by a batteries-included type system: when you source a nushell script, the function definitions immediately become global shell commands. You get help functionality and great error messages along with solving your current problem:

◄ 0s ⋈┈◎ : help makepdf
Usage:
  > makepdf <name> 

Flags:
  -h, --help: Display the help message for this command

Parameters:
  name <string>

Input/output types:
  ╭───┬───────┬────────╮
  │ # │ input │ output │
  ├───┼───────┼────────┤
  │ 0 │ any   │ any    │
  ╰───┴───────┴────────╯        
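The any -> any signature above is just the default; nushell (since 0.87) also lets you annotate input and output types on a definition, and help picks them up. A sketch for makepdf, declaring that it takes any pipeline input and produces nothing:

export def makepdf [name: string]: any -> nothing {
  to text | enscript -B -fHelvetica10 -p - | ps2pdf - | save $"($name).pdf" -f; mupdf $"($name).pdf"
}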

By comparison, in Python you would need to import the Click library or Tiangolo's Typer to get the same kind of access to your own utility builders.

One recent discovery of mine is a clever CLI tool called pdfcpu. It does many things with PDFs, but the one I found quite ingenious is PDF creation from a JSON data format.

I previously made a very verbose PostScript parser in nushell that I felt wasn't maintainable or scalable, as I would essentially be parsing Ghostscript in a nushell repl to make documents.

With pdfcpu, I can get a full cover letter written and formatted in pdf in less than a second.
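The shape of that JSON is simple enough to sketch straight from nushell. A hello-world version (field names follow pdfcpu's create format; the pos values are points on an A4 page, origin at the bottom left):

{paper: A4, pages: {1: {content: {text: [
  {font: {name: "Helvetica", size: 12}, value: "Hello, PDF", pos: [72 770]}
]}}}}
| save hello.json -f
pdfcpu create hello.json hello.pdf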

export def coverletter [] {
  # input | enscript -fCourier14 -B -p - | ps2pdf - coverletter.pdf; mupdf coverletter.pdf

  # let cv = importjson resume.json # local development
  let cv = http get https://gist.githubusercontent.com/shaoyanji/b7b844737e6469c9160bf41aa8970068/raw/resume.json
  let font = "Roboto-Regular"
  # let font = "Courier"
  let boldf = "Helvetica-Bold"
  # let boldf = "Courier-Bold"
  let fsz = 12          # font size in points
  let px = 96 / 2.54    # pixels per centimeter at 96 dpi
  let lm = 2.5 * $px    # left margin
  let tm = 4.5 * $px    # top margin
  let rm = 2 * $px      # right margin
  let prompt = "Write a job cover letter in German. Compose a brief and impactful cover letter based on the provided job description and resume. The letter should be no longer than three paragraphs and should be written in a professional, yet conversational tone. Avoid using any placeholders, and ensure that the letter flows naturally and is tailored to the job. Analyze the job description to identify key qualifications and requirements which are listed in yaml after the job description. Introduce the candidate succinctly, aligning their career objectives with the role. Highlight relevant skills and experiences from the resume that directly match the job’s demands, using specific examples to illustrate these qualifications. Reference notable aspects of the company, such as its mission or values, that resonate with the candidate’s professional goals. Conclude with a strong statement of why the candidate is a good fit for the position, expressing a desire to discuss further. Please write the cover letter in a way that directly addresses the job role and the company’s characteristics, ensuring it remains concise and engaging without unnecessary embellishments. The letter should be formatted into paragraphs and should not include a greeting or signature."
  {paper: A4, pages: {1: {content: {
    text: [
      {font: {name: $font, size: $fsz}, value: (
        $cv.basics.name
        | append [
          $cv.basics.location.address
          [[$cv.basics.location.countryCode $cv.basics.location.postalCode]]
          $cv.basics.phone
          $cv.basics.email
        ]
        | to text), pos: [(595 - $rm - ($cv.basics.email | str length) * 12 * .5) (842 - 5 * $fsz)]}
      {font: {name: $font, size: $fsz}, value: ("Freiburg, " + (date now | format date "%d.%m.%Y")), pos: [$lm (842 - 8 * $fsz)]}
      {font: {name: $font, size: $fsz}, value: (input --reedline -d '\n\n\n' "Address> "), pos: [$lm (842 - 12 * $fsz)]}
      {font: {name: $boldf, size: $fsz}, value: (input -d 'Betreff: Bewerbung' "Subject> "), pos: [$lm (842 - 14 * $fsz)]}
      {font: {name: $font, size: $fsz}, value: ("Sehr geehrte Damen und Herren,\n\n" + (gum write --header="vG <Esc> gqq in vim" --value (groq ($prompt + ($cv | to text) + (input --reedline "Job Description> "))))), pos: [$lm (14 * $fsz)]}
      {font: {name: $font, size: $fsz}, value: "Mit freundlichen Grüßen,\n\n\n\n\n\n\nMatt Ji", pos: [$lm (2 * $fsz)]}
    ]
    image: [
      {src: "https://jisifu.vern.cc/signature.png", pos: [$lm (4 * $fsz)]}
      # {src: "./signature.webp", pos: [50 (4 * $fsz)]} # local development
    ]
  }}}}
  | save pdfcpu.json -f
  pdfcpu create pdfcpu.json Ji_Matt_Cover_Letter.pdf; mupdf Ji_Matt_Cover_Letter.pdf
}

And translated into German to boot, thanks to the LLM. This does require some knowledge of vim shortcuts to reflow line widths, since the LLM's JSON blob doesn't insert the newlines the layout needs.

This composability is the key factor in nushell. Python can sort of do all of this too, but it is thread-limited and performs far worse than Go for the micro-work of PDF generation, or than Rust for the wonderful shell that just works out of the box.
