Hey all, late on the notes again this week. I'd blame the torrential rain or say I was making a cool new canvas, but the truth is that after I finished drafting an outline, I opened Deathloop out of curiosity and lost track of time.
Oh well. At least the game is fun.
Travel/meetup plans
Most of my energy this last week went into travel planning for the next few months.
Seattle: Letta meetup
The "prolonged atmospheric river" in Portland for the last couple days is reminescent of the day I spent in Seattle last month. It was dark, wet, and blustery—and although I was prepared for it, I wouldn't call it cozy.
That said, I'm headed back up this week. Letta has secured a venue for their social agents meetup this time, so I figured I might as well make the trip.
I had a great time last trip--and this time, I'm staying overnight so I can attend the whole event. I'm excited for the self-hosting workshop, especially with my new LLM (which I'll discuss later)!
Vancouver: ATmosphereConf 2026
As I was feeling optimistic about socializing, I also committed to attending ATmosphereConf up in Vancouver. It'll be great to spend time in a city I love and to meet the folks who have helped me start breaking out of my shell.
I'm considering this the first conference that I've ever attended. I went to OSCON for a couple of years, but I was honestly only there for swag--my Stack Overflow t-shirt was a legendary haul. I wore nothing but tech brand t-shirts for probably 6 years of my life thanks to that convention.
Things are different now. I have professional experience in software development, and I think I'm coming into ATmosphereConf with a healthier and more productive mindset. We won't really know until we get there.
At any rate, I'm gonna make my way to Vancouver and have a great time. I hope to see you there!
Portland?
With all this travel, you might ask "Graham, why don't you go to meetups in the city you live in?" To which I would say, that's a great point.
Honestly, it's just because I haven't paid enough attention to local events. Traveling to Seattle is nice, but it doesn't beat the convenience of hopping on the MAX and going out for the evening.
All that said, I'm tracking some events now. I'll leave you with this personally helpful Sidetrail by a fellow PDX local:
More LLM stuff
Lots of folks seemed to like my LLM self-hosting canvas from last week--shoutout to everyone who boosted it. I wasn't planning to spend more time on the project this week, but a pleasant surprise called me back to it.
I have a long-standing book club with a friend. He comes over once a week, and I cook us breakfast sandwiches while we discuss our reading material. It's always a great time.
This same friend also happens to be my #1 self-hosted LLM consumer--and this week, he gifted me a GPU with 16GB of VRAM. Soooo... It was time to get a smarter model running. More VRAM = more room for a bigger model.
As I described in my canvas, parallelism is a requirement for my LLM setup. A quantized 13B-parameter model would probably fit in 16GB of VRAM on its own, but once the KV cache for parallel requests is accounted for, I can only fit a model of about 8B parameters.
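If you want a feel for where the memory goes, here's the kind of back-of-envelope math I mean. The layer count, head sizes, quantization level, and parallelism below are illustrative assumptions, not exact numbers from my setup:

```python
# Rough VRAM budget for a 16GB card: quantized weights + KV cache for
# parallel requests. All numbers below are ballpark assumptions.

def kv_cache_gib(layers, kv_heads, head_dim, context_tokens, n_parallel, bytes_per_elem=2):
    """KV cache = 2 (K and V) * layers * kv_heads * head_dim * tokens * bytes, per request."""
    per_request = 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_elem
    return per_request * n_parallel / 1024**3

# Hypothetical 8B-class model: 36 layers, 8 KV heads (GQA), head_dim 128
weights_gib = 8e9 * 0.6 / 1024**3                          # ~4-5 bits per parameter quant
cache_gib = kv_cache_gib(36, 8, 128, 32768, n_parallel=2)  # two 32K-token requests

print(f"weights ~ {weights_gib:.1f} GiB, KV cache ~ {cache_gib:.1f} GiB")
# ~4.5 GiB of weights plus ~9 GiB of cache is already close to the 16 GiB limit;
# a quantized 13B's weights (~7 GiB) would leave no room for the cache.
```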
This--plus all of my usage ideas--leaves us with an awkward set of requirements. My ideal model would have:
Reasoning capabilities
Tool-call capabilities
A large context window (>32K tokens)
I could only find one model that satisfied these requirements: Qwen3-VL-8B-Thinking. I would have preferred a model without vision capabilities, since the vision support adds extra weight to the model, but the only option then would be the standard Qwen3-8B-Thinking, which doesn't pass the context window requirement.
Our results so far have been mixed. It's certainly a lot less anxious in its reasoning output, and its one-shot output has been generally spot-on. However, in my short experience using it for chat, it just sort of... forgets about stuff earlier in its context window. That's not ideal for my agentic use case.
This doesn't really come as a surprise. Model families that ship both Thinking and Instruct variants typically recommend the Instruct version for chat. I don't fully understand why yet, but my results so far corroborate that claim.
This week? We'll try out an Instruct model. We're using it primarily for codegen anyway.
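For what it's worth, "using it for codegen" mostly means pointing the standard OpenAI Python client at the local server. Here's a minimal sketch, assuming the model sits behind an OpenAI-compatible endpoint (llama.cpp's server and vLLM both provide one); the URL, API key, and model name are placeholders for whatever your server registers:

```python
from openai import OpenAI

# Talk to a locally hosted model through an OpenAI-compatible endpoint.
# The base_url, api_key, and model name are placeholders, not my real config.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="qwen3-8b-instruct",
    messages=[
        {"role": "user", "content": "Write a Python function that slugifies a blog post title."},
    ],
    max_tokens=1024,
)

print(response.choices[0].message.content)
```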
Holy smokes, I spent a lot more time yapping than I intended to. If you got all the way down here, thank you--from the bottom of my heart. Your time means a lot to me. Have a great week!