MACHINE LEARNING NOTEBOOK, REUSS COMPUTE HOUSE
First rule: the model is not thinking.
Second rule: write down what you mean by thinking before someone clever asks.
Samir Vale learned neural networks from a wheat disease.
The disease put orange dust on leaves and panic into farmers. Agronomists could identify it by eye, but not every village had an agronomist, and the Web was full of pictures taken by people who believed "in focus" was a suggestion. Samir's first useful model looked at leaf images and guessed rust, mildew, nutrient deficiency, insect damage, or "please take another picture."
It was wrong often enough to be educational and right often enough to be funded.
The machine was not mysterious once you opened it, which people generally did not want to do. An image became numbers. Numbers moved through layers of weighted sums. The sums bent through simple functions. Errors flowed backward. Weights adjusted by tiny amounts. Repeat until the guesses improved or the compute budget died. The old Archive called this backpropagation and gradient descent. Samir called it learning from being wrong many times in a row.
His grandmother called it statistics with an electricity bill.
She was close. The Reuss Compute House used more power than a small mill and dumped enough heat into the river that fishers attended budget meetings. Its first accelerator boards were ugly slabs of silicon optimized for matrix multiplication: multiply, add, multiply, add, millions of times, the arithmetic equivalent of weaving cloth from numbers. They were descended from graphics processors in the Archive, though the valley had no appetite yet for old-world games in which one pretended to shoot strangers after work.
The accelerator program existed because normal processors were too slow and because Samir had once said, in front of the wrong minister, that training larger models was "mostly waiting for math to finish." Ministers loved anything that sounded as if waiting could be legislated away.
Hannah Rowan ran the data office and did not love Samir.
This was inconvenient because Samir loved her, or thought he did, or at least thought about her whenever a training run failed and he wanted someone smart to be disappointed in him accurately.
Hannah was a descendant of Paul Rowan, though she did not advertise it. She wore plain shirts, cut her own hair badly, and carried Kai Rowan's privacy rulings in a battered reader case. Her job was to decide which data could be used for training, which put her between every ambition and its easiest corpus.
"No private messages," she said at the first language-model meeting.
Samir looked up from the model plan. "We are not proposing raw private messages."
"You are proposing 'available Web text,'" Hannah said. She tapped the phrase with one finger. "I have learned to distrust available. It often means no one has had time to object."
"Public boards are public," Samir said. "Archive texts are public. Technical manuals, school lessons, court records, published fiction, code repositories, old corpora. The message sets would be opt-in."
"How?"
"A tag in the page metadata, plus an upload notice. The uploader chooses whether the page can be used for training, and the crawler respects the tag."
"Respect is doing a lot of work in that sentence."
"The notice explains the categories."
"A notice is not an explanation," Hannah said. "A widow writing about her husband's last fever is not thinking about derivative models. A twelve-year-old posting on a school board is not weighing future training rights. A village clerk uploading meeting notes is trying to leave before supper. You know perfectly well that half the Web uses default settings because people are tired."
Samir rubbed his eyes. "I know. I am not trying to trick tired people into donating their lives to my parameter count. I am trying to build a model that can help people search records they cannot otherwise use."
"I believe you. If I did not, I would have sent a refusal and gone home early for once."
The proposed model did not begin as a grand intelligence. It began as a tool for searching the deep Archive more flexibly than Not-Mara, translating old code comments, drafting repair instructions, and helping clinics summarize records. Not-Mara could retrieve, synthesize, and hallucinate within indexed material. A language model would be trained to predict text, token by token, from vast examples. With enough data and compute, old-world documents said, such models learned grammar, style, facts, reasoning patterns, code structure, and an alarming ability to sound as if they understood lunch.
The Archive had preserved papers on transformers, attention, embeddings, pretraining, fine-tuning, reinforcement learning from human feedback, constitutional methods, interpretability, benchmark contamination, model collapse, prompt injection, adversarial examples, and the social disaster of believing demos.
Everyone marked the last category as required reading. The first good demo still made half the room forget the warnings for about ten minutes.
Before that came data cleaning.
Data cleaning was not glamorous. It was clerks, teachers, programmers, librarians, and bored students arguing over duplicates, broken encodings, slurs in old forums, corrupted tables, contradictory manuals, jokes that looked like instructions, instructions that looked like jokes, and thirty thousand versions of the same bread recipe. Hannah's office made data cards: source, consent, restrictions, known bias, quality, hazards. Engineers complained. Hannah made the cards mandatory.
"The model will not care," one engineer said.
"No," Hannah said. "But reviewers will. Courts will. So will the person who discovers too late that their village argument became training data. If your system cannot carry a source card, then the system is not ready for sources you do not personally own."
Samir grinned before he could stop himself.
She saw. "Do not encourage me."
"I'm afraid that's already happening."
"Then control yourself."
Their first serious personal conversation began badly, in a committee room, over whether model weights inherited consent obligations. Later conversations were warmer. Not always easier.
The first language model was called Workbench.
Not Ada. Not Oracle. Not Clara. Hannah vetoed all human names.
"If you give it a person's name," she said, "people will treat it like a person when convenient and like property when convenient. We have enough ways to avoid responsibility."
Workbench had 120 million parameters, which sounded large until Not-Mara retrieved old-world references to models with hundreds of billions and beyond. Samir found those numbers exciting and irresponsible to say near funders. Reuss could barely train Workbench without dimming the south mills. Cooling water had to be scheduled. Matrix boards failed. A single bad solder batch ruined a week. The optimizer diverged twice because the learning rate, decimal point and all, had been copied from an old example without context.
When Workbench first produced coherent text, it answered a repair query about a pump seal:
Remove the housing bolts in a star pattern. Keep the old gasket until the new one is seated. If the shaft is scored, do not pretend the seal will forgive you.
The lab applauded.
Hannah did not.
"That last sentence is Vale style," she said.
"Probably," Samir said. He was still smiling, but less certainly.
"Not the fact. The way it scolds."
"Some author habits survived filtering."
"Enough for people to hear a person where there is only a pattern."
Samir looked back at the screen. The paragraph seemed harmless and useful. That was what made the room colder.
"If someone feeds it enough of one person's work," Hannah said, "how close does it get?"
"Close enough to hurt somebody."
"Then we write that down before anyone calls it a feature."
"Agreed," he said. "Not as a feature, not as a demonstration, not because someone thinks grief deserves an exception."
He meant it. He also knew someone else would not.
The first accident was not a death. It was a letter.
A grieving man in Westlake asked Workbench to write in his dead wife's voice. He fed it her public essays, her council speeches, her recipe notes, and old message fragments he had copied before the privacy rules tightened. Workbench produced a letter that made him weep for two days. Then he posted it as if she had answered.
By morning, three sects had opinions. By noon, the dead woman's sister was in Hannah's office, shaking with anger.
"She would have hated this," the sister said. "She hated being quoted while alive."
Hannah said, "I'm sorry."
"Don't be sorry. Stop it."
They built a dead-voice rule: no simulation of any person, living or dead, without explicit consent recorded before death or granted through a court-recognized estate process, and even then only with a plain label. Fictionists protested because half their theater involved impressions of dead politicians. The compromise allowed parody and performance by humans, restricted machine simulation, and pleased nobody. Workbench learned the label:
GENERATED VOICE. NOT THE PERSON.
Some users cropped the label before reposting outputs. Hannah's office had expected this, which did not make it less exhausting.
So the label became cryptographic: embedded in the file signature, visible in trusted readers, hard to remove without breaking provenance. This led to the first serious public lesson in metadata. Rosa Kade, now older and in charge of network education, made a children's guide:
If a document will not say where it came from, ask why it is hiding.
Maya said, "Still technically imprecise."
Rosa said, "Still better than your potatoes."
Maya conceded nothing.
Workbench became useful first in rooms where no one was trying to make history.
At the riverside clinic, a nurse named Anja used it after a night shift to rewrite discharge instructions for a mason with a cracked rib and a grandson with asthma. The official form was accurate, dense, and cruel to tired eyes. Workbench produced six short lines:
Take the blue tablets with food.
Do not bind the chest.
If breathing becomes shallow, the lips turn blue, or confusion sets in, come back even if the road is bad.
Keep smoke away from the boy's sleeping room.
The cough may hurt. Pain that changes suddenly is different.
Show this paper to whoever says you look fine.
Anja checked each line against the physician's note, added the exact dosage in her own hand, and stamped the bottom with the clinic seal. The mason read the page twice.
"This one I understand," he said. "Why don't you always write them like this?"
Anja looked at the wall of waiting patients, the stack of forms, and the physician asleep upright in the supply room. "Because we used to need the time for other emergencies."
He folded the paper carefully and tucked it into his shirt. "This is an emergency too, just slower."
She put that sentence in the clinic report. Hannah underlined it.
In Westlake, Workbench matched three fault reports that had never found one another because every shop had named the same bearing noise differently. Reuss called it chatter. Westlake called it river rasp. A mountain workshop called it wall knock, which sounded like a different fault until the vibration plot matched. The shared cause was a supplier grinding raceways with a wheel dressed too rarely. Workbench had not solved the machine. It had noticed that three groups of annoyed people were describing the same sound.
"That is not intelligence," a factory manager said at the review.
"Fine," Hannah said. "Then call it clerical patience at machine speed and fix your bearing order."
The manager did. Privately, because pride had not collapsed with civilization.
Programmers used Workbench to translate old code comments written for libraries that no longer existed. Teachers used it to make three reading levels of the same lesson and then complained because students preferred the rude version. A semiconductor team asked why a diffusion furnace drifted hot every seventh cycle. Workbench suggested calendar heat load, cleaning residue, and a maintenance script that skipped a sensor check after six successful runs. The third was right. The technician who found it carried the printout home folded in his coat pocket, then brought it back the next morning because his wife said family dinner did not require documentary evidence.
Nobody held a festival for any of this. That was part of the change. After a few months, Workbench moved from marvel to nuisance to tool. People cursed when it was down. People ignored its footnotes until Hannah's office made the trusted readers flash yellow when a technical answer had no source. People complained that yellow was patronizing. Hannah left the setting on.
Then it lied.
Not randomly. Worse: fluently. It filled gaps with the shape of likely answers. It invented a corrosion allowance in a pressure vessel note by blending three standards. The error passed through two tired reviewers and was caught only because an apprentice named Mara Chen asked why the number matched no table.
Samir found her in the test bay, where she was sitting on the floor with the printout.
"I think the model made this up," she said.
He took the page. His stomach went cold.
"How did you catch it?"
"It sounded too smooth," Mara said. "Every real pressure note I have read has scars on it. This one sounded as if nobody had ever argued with it."
"That's not a method."
"No. But it got me to check the tables, and the tables did not recognize it."
The pressure vessel was never built. The inquiry was still ugly.
Hannah read the fabricated paragraph aloud in the hearing. "This is what we mean by hallucination. Not madness. Not magic. It is a system producing a plausible answer without grounding. The fluency is the hazard."
An older councilman said, "Then why use it?"
She looked tired. "Because the same system that nearly wrote a bad vessel note also found three mismatched drug names last week before a clinic used them. We do not get to keep only the tidy tools. We decide where they are allowed, who checks them, and what happens when they fail."
Samir, sitting behind her, winced at the clinic example. It was accurate, which made it harder to dislike.
She caught the wince and almost smiled.
The rules tightened: citations required for technical claims, source-linked outputs, confidence displays banned unless statistically defined, mandatory human review for safety-critical use, adversarial testing, red-team prompts, training-data documentation, model cards, evaluation sets kept secret, and an honesty standard that forbade marketing a system as understanding more than tests supported.
Marketing departments mourned.
Teenagers immediately used Workbench to cheat on school essays.
Teachers responded by assigning oral defenses, local observation journals, and practical tasks the model could not do without a body. This improved education more than anyone expected. A student could ask Workbench for help understanding ammonia synthesis, but still had to explain why the school compost heap smelled wrong. A model could draft an essay on Clara Voss, but could not interview a grandmother about boiling water during flood season unless the student did the interview.
The Web changed around the models.
Pages began carrying training tags: public, private, no-train, train-with-credit, train-after-delay, research-only, parody, sacred, dangerous, wrong-on-purpose. Some tags were enforceable. Some were wishes. Some were lies. Hannah argued that wishes mattered because law often began as a wish with witnesses.
Samir argued about scale.
"If every village has a different training rule, the corpus becomes a swamp."
Hannah rinsed one bowl and set it in the rack. "Then your map of the swamp had better be good."
"That is not an answer."
"It is the answer people give when they do not have the same customs and you still want their data. A Veil mourning page is not a Reuss fault report. A court transcript is not a child's joke board. If your crawler treats them the same because that is convenient, it is not neutral. It is lazy with excellent hardware."
"Consent that cannot be implemented becomes a sign on a locked door with no wall attached."
"Then tell me what wall you need. Metadata, access controls, audit logs, slower collection, smaller models, a training run delayed until the consent tables stop contradicting themselves. I am not asking you to do magic, Samir. I am asking you not to call the simple version ethical because the real version is expensive."
"I am trying to build something coherent."
"So am I."
They were in her kitchen, late, with dirty bowls between them and the compute house humming faintly through the window. This was how most important fights happened: not at podiums, but when everyone was too tired to perform.
Samir said, "Do you trust me at all?"
Hannah looked at him for a while. "Yes."
"It never sounds like it."
"Because I don't trust what praise and pressure do to you."
That landed harder than any accusation. He looked down at the bowls.
She softened. "And I do not trust what they do to me either. Success makes everyone start explaining why the next shortcut is reasonable."
"That is a bleak household policy."
"It is also why the dishes are still here. Neither of us has built a governance structure for soup bowls."
He laughed, because it was either that or leave.
They married three years later, after Workbench-3 and before a training failure that took six months to live down. Their child was born during the quieter year that followed.