June Update: Discovering Discovery

Published: June 6, 2025

Introduction

We're back to post an update on the case now that half a year has passed. In the interim, the models have continued to improve. Now that the defendants' briefs have been posted, we have grown more confident in the strength of our case and in the models' ability to help us win it. It's one thing to read a simulated dismissal motion, and quite another to find that all the relevant arguments in the actual brief are nearly identical to the model outputs.

A brief note: this website will be the canonical source of these posts going forward. As usually happens once an account becomes established on a social media site, the moderators eventually begin censoring posts; this has happened to us consistently over the past twenty years. While you may see the full text on other sites, stevesokolowski.com is where the original posts will be preserved in perpetuity.

Recap

Much of the time since January has been spent in rounds of dismissal briefs. After the defendants filed their first motions to dismiss, we amended our complaint with 250 pages of additional evidence. They then moved to dismiss the new complaint. Throughout this period, we read an estimated 1,000 cases, about 150 of which we ultimately referenced in our final opposition brief. It was agreed that the defendants would be permitted to respond on July 1, 2025. All of the existing filings are available on this site in the "Case Updates" section.

Simultaneously, we filed a complaint in the District of Connecticut, alleging the same facts as are alleged in the Middle District of Pennsylvania. We requested that the Connecticut court stay or dismiss that case, so that if the Pennsylvania case is dismissed for a non-merits reason (such as a lack of personal jurisdiction), the statute of repose will not have expired in Connecticut.

Attorneys

The most obvious question six months into the litigation is whether we would have been better off hiring an attorney. One would think the answer is "of course, yes," and we thought so at the time, but we're no longer certain of that - even though it took more than 350 hours to write the response brief. Before filing the Connecticut action, we contacted ten lawyers in Connecticut. Nine of them didn't reply, and the one who did asked me what the "MDPA" was.

This experience is in line with my previous encounters with attorneys - and our belief is stronger now that even the obsolete o1 was superior to every attorney we've ever met. Every attorney I've met has either asked us to do research for him (in 2017, an attorney we contacted to determine the legality of PROHASHING asked us what we thought), held useless meetings to charge more (like having two attorneys present in the same room to double the cost), offered to provide useless services (like trying to get Zac Prince criminally prosecuted), or taken five years to accomplish what should have been done in two (my mother's divorce case).

The world (and perhaps the American Bar Association) makes it seem as if attorneys are godlike. But the defendants' briefs support our other experiences: a large proportion - even a majority - of lawyers display a lack of professionalism towards the public. Many attorneys are narrowly focused on very specific types of cases that they churn through repeatedly, and have a surprising lack of depth.

If you've never needed an attorney, it's hard to understand how pervasive this situation is. If you want to find out, make up a case and paste the details into many online attorney website forms. Even if you are extremely detailed and use proper legal terminology like we did, the odds are better than even that you will receive zero replies. What's amazing is that this kind of disrespect isn't good business sense; if you want to maintain a good reputation, you reply to messages even if the answer is "no." Evidently, most attorneys do not prioritize their long-term reputations.

A MAJOR Disconnect

That brings us to the next observation - a severe disconnect exists between the online community and the models. And this isn't even close - it's not as if the commenters are saying "you have a 60% chance of winning but the models say 80%," or the other way around. The situation is that many online commenters have been sharply negative about the case, while all the models consistently agree that the case is strong.

We ran a Monte Carlo simulation with 20,000 different paths through a number of branches on May 26, to finalize the specific wording of the response brief and the courses of action we could take. In that simulation, o3 simulated actions the defendants might take, along with differing probabilities of their success; the probabilities depended upon various external factors. We chose the arguments, wording, and actions with the highest expected value among the options consistent with the true facts, and that expected value was quite high.
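
To illustrate the general shape of this technique, here is a minimal sketch of a Monte Carlo run over a strategy tree. Everything in it - the strategy names, branch probabilities, and payoffs - is hypothetical, not taken from the actual simulation, which used model-generated branches and probabilities:

```python
import random

# Hypothetical strategy tree: each strategy has branches of
# (probability, payoff). All names and numbers are illustrative.
STRATEGIES = {
    "argue_equitable_estoppel": [
        (0.55, 1.0),   # e.g., motion to dismiss denied
        (0.30, 0.4),   # e.g., partial dismissal, leave to amend
        (0.15, 0.0),   # e.g., full dismissal
    ],
    "argue_alternative_theory": [
        (0.40, 1.0),
        (0.35, 0.4),
        (0.25, 0.0),
    ],
}

def sample_payoff(branches):
    """Walk one random path through a strategy's branches."""
    r = random.random()
    cumulative = 0.0
    for probability, payoff in branches:
        cumulative += probability
        if r < cumulative:
            return payoff
    return branches[-1][1]  # guard against floating-point rounding

def estimate_ev(branches, n_paths=20_000):
    """Average payoff over n_paths simulated paths."""
    return sum(sample_payoff(branches) for _ in range(n_paths)) / n_paths

for name, branches in STRATEGIES.items():
    print(f"{name}: estimated EV = {estimate_ev(branches):.3f}")
```

The real version replaces the hard-coded numbers with model-estimated probabilities for each defendant response, then selects the wording and actions with the highest expected value.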

So, unintentionally, a decisive experiment has been set up: we will find out whether the models are right or the humans are right, and the result is going to be definitive. If the models turn out to be flawed, that would likely indicate that it will be 5 or more years until LLMs can provide even basic legal information. Because so much research - 10+ hours per day - was done with the models, a loss will certainly not be due to humans (us) "making a mistake" - it will be a rejection of the idea that models, at least at current sizes, can understand the law.

My suspicion is that the disconnect is caused by the commentators not having spent the 300 hours we did reading the evidence before filing the complaint, and the models will prove correct because their context windows are large enough to consider all the evidence and case text at once. But it's also possible that there is some fundamental flaw in the models that consistently biases them all towards a specific result - like for example if a specific important case was not included in the training data. I suspect that if the models do cause a loss, it will be because the models were trained on the wrong data.

Context Windows

Speaking of context windows, the most significant improvements in AI tools since December 6 - when o1 pro was released and it became feasible to pursue this case - have been the dramatically increased size of context windows and the dramatically increased ability of models to understand everything in the window. The current free version of Claude 4 Sonnet has an incredibly short context window, certainly less than 32,000 tokens. In pre-singularity times, even that would have been considered an extraordinary improvement - an 8x increase over GPT-4's 4,096 tokens in just two years. Yet, now that the Gemini models offer more than a million tokens, Claude 4's "small" window seems obsolete.

The large context window of the latest Gemini models allows us to put the entire case into the model when asking it questions. Now that the Gemini 2.5 Pro Preview 06-05 model achieves a 90.2% recall rate, it rarely makes mistakes like the OpenAI models do - those models easily "forget" key facts, which can then lead to catastrophic errors. That's why every action we take goes through at least three different models.

One vastly underappreciated capability of these models is that they can chew through briefs - text is what they do. They can pick out any contradictions whatsoever. You might have noticed that the defendants have a massive number of contradictory arguments in their briefs. Gemini Pro 2.5 Experimental 03-25 allowed us to quickly gather a list of the contradictions so that we could then determine their importance, using a pass like the sketch below.
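
A minimal sketch of this kind of contradiction pass, using Google's generative AI Python SDK. The model name, prompt wording, and file name are illustrative assumptions, not the ones actually used:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Model name is an assumption; any long-context Gemini model works.
model = genai.GenerativeModel("gemini-1.5-pro")

def find_contradictions(brief_text: str) -> str:
    """Feed an entire brief into the context window and ask for a
    list of internally contradictory arguments."""
    prompt = (
        "List every pair of contradictory arguments in the following "
        "brief. Quote both passages and explain each conflict:\n\n"
        + brief_text
    )
    return model.generate_content(prompt).text

# Hypothetical file name for the defendants' brief.
with open("defendants_brief.txt") as f:
    print(find_contradictions(f.read()))
```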

There was a time when so-called "structured" data was believed to be necessary - for example, when it was thought that making things searchable required turning them into JSON files with specific attributes. It's clear now that this will never be needed; the future is simply writing lots of data in human-readable text and putting it in models' context windows to reason over.

Workflow

We began researching the response brief to these dismissal motions before the complaint was filed, because we needed to know whether the case was viable or not before posting it to the docket. Thus, what might seem like an enormous effort in only five weeks was actually underway well before their first dismissal motion was filed. The main challenge was reviewing the specific cases the defendants cited in their briefs. The cases were mostly subtly off-topic in one tiny way - for example, being a New York case in a Pennsylvania suit - and it was easy to overlook the small difference even after reading the entire case text.

As stated earlier, and as we pointed out in our briefs, the defendants' briefs are confusing, contradictory, and disorganized. But word order is only one input into an LLM, so the order of the arguments doesn't faze a model. After struggling to understand it all, we simply put it into Gemini Pro 2.5 Preview 03-25 and asked it to output the document again, reorganized with their strongest arguments first and with the most confusing chains of thought summarized. And once we figured out the method, the models were able to quickly identify the subtle differences that made 90% of the cases irrelevant.

Next, we had the models output a full response brief for each argument section. These initial briefs weren't used, because o3 hadn't been released when they were started and previous models were prone to case hallucinations - but we could read them and get a general idea of how to respond (or how not to). By the time the newer versions of Gemini had been released, we already had an overall draft. I suspect that next time we will be able to use exact output text without needing to double-check the cases - even though we will check them anyway.

Finally, and most importantly, I send nothing, no matter how trivial, to the defendants' attorneys or to the court docket without running it in its entirety through two models. The models are given the full context of the case and asked to perform a final check, even if it's just a simple email. Attorneys are not friends - and DCG/Silbert's attorneys treat us as a step even below enemies - so we don't care if the messages sound overly formal or "AI-generated"; words used in legal cases are just like formal symbols used in math equations.

Checking Cases

We read every word of every case cited in the briefs. To confirm that our understanding was correct, we then took the full case text, one case at a time, and asked a model to confirm that what we wrote is supported by that text. Only one error was found, and it would not have invalidated the argument: we had misinterpreted one of the case's findings, though the case supported the argument in another way. Therefore, we can say that when context is short, the models are now near-perfect at cross-referencing two documents to ensure that they match.
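
A minimal sketch of this verification loop, using the OpenAI Python client as a stand-in. The model name, prompt wording, file names, and the sample proposition are all assumptions for illustration:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

VERIFY_PROMPT = (
    "Below is the full text of a court opinion, followed by a "
    "proposition we cite it for. Answer SUPPORTED or NOT SUPPORTED "
    "on the first line, then briefly explain why.\n\n"
    "CASE TEXT:\n{case}\n\nPROPOSITION:\n{claim}"
)

def verify_citation(case_text: str, claim: str) -> str:
    """One case at a time: paste the entire opinion into the context
    window and ask whether it supports the claim as written."""
    response = client.chat.completions.create(
        model="o3",  # model name is an assumption; any strong model works
        messages=[{
            "role": "user",
            "content": VERIFY_PROMPT.format(case=case_text, claim=claim),
        }],
    )
    return response.choices[0].message.content

# Hypothetical (case file, cited proposition) pairs.
citations = [
    ("smith_v_jones.txt",
     "Equitable estoppel tolls the statute of repose where the "
     "defendant concealed the fraud."),
]
for case_file, claim in citations:
    with open(case_file) as f:
        print(case_file, "->", verify_citation(f.read(), claim))
```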

While o3 doesn't make many mistakes searching the Internet for cases, it never makes a mistake when the full case text is pasted into the context window. That's why the defendants' violation of the Local Rules in not including the unpublished opinions was so damaging to our response, and why we pointed out the rule violation in the brief.

A flaw that was prevalent in earlier versions of Claude, and which led us to discontinue our Anthropic subscriptions to save money, is that Claude 3.5 Sonnet (New) would find cases on the correct topic, but the judge's ruling would actually be the exact opposite of the argument Claude was trying to support. It would select quotes that were indeed contained in the opinion, but the quotes were often near the top of the ruling, and later in the ruling there would be a statement like "all of that was true for those other cases, but this case is different and therefore that doesn't hold here." This issue has largely been solved in competitors' latest models, and pasting the entire case text into the context window eliminates it entirely.

The way we determine whether a course of action is viable is by actually writing the motion to take that action. We have the models output the actual motion with case citations, and then we read all the cited cases to ensure that they support the text of the sample motion. Then we ask for more cases, until we have a full brief. In many situations we won't file the brief, but it is trivial to output text like this. While the point is to protect against inaccurate citations, with the latest Gemini versions we have never ended this process with a course of action that turned out to be unsupported by case law. This process is how we determined that moving to stay the Connecticut case would be properly supported by the law.

Simulations and Strategies

I discussed earlier how we had run Monte Carlo simulations of the case, as simulations are a critical part of our workflow. However, these are not the only kinds of simulations we run. We now go even further than just numbers.

Models are now able to output massive numbers of tokens, whereas back in the GPT-4 days one would be lucky to get a few hundred tokens of output before OpenAI timed the request out to save money. By simply stating "be excruciatingly detailed" in the prompt, one can now get entire simulated opinions.

There's another innovation, however, which has made things even more accurate. Now that all of the major providers have given their models Internet access, we can instruct models to act as if they were the judge who will be ruling on the case. The best way to do this is to kick off a Deep Research report on all the opinions the judge has written and his or her views, and then feed that report into a new prompt, asking the model to predict the judge's opinion. We can then see the strengths and weaknesses in the case.
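
Deep Research itself runs interactively, but the second step - combining the saved report with the briefs and asking for a simulated opinion - can be scripted. A minimal sketch, again with the OpenAI client; the file names, model name, and prompt wording are illustrative assumptions:

```python
from openai import OpenAI

client = OpenAI()

def simulate_opinion(judge_report: str, briefs: str) -> str:
    """Ask a model to write the opinion it predicts the judge would
    issue, given a research report on the judge's prior rulings."""
    response = client.chat.completions.create(
        model="o3",  # model name is an assumption
        messages=[
            {"role": "system",
             "content": ("Act as the judge described in the report below. "
                         "Write the opinion you would issue on the pending "
                         "motions. Be excruciatingly detailed.")},
            {"role": "user",
             "content": (f"REPORT ON THE JUDGE:\n{judge_report}\n\n"
                         f"BRIEFS:\n{briefs}")},
        ],
    )
    return response.choices[0].message.content

# Hypothetical file names for the saved report and the combined briefs.
with open("judge_report.txt") as r, open("all_briefs.txt") as b:
    print(simulate_opinion(r.read(), b.read()))
```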

For this latest brief, we ran these simulations over and over, both generating opinions and asking the models to rate arguments on a scale from "very weak" to "knock out." In our final review, we ended up with two arguments rated "knock out" and all others rated "very strong" - except that, for some strange reason, the argument about the procedural rules the defendants unequivocally violated was only rated "strong."

o3 vs Gemini

As a short aside, Gemini is the clear leader and has surpassed OpenAI. If the normal Gemini interface supported the attachment of audio files, I would cancel our OpenAI subscription and switch to Google. Google continues to widen its lead with every release.

Whatever advantages in intelligence the Gemini models may have, it's clear that o3 cannot deal with even a moderate amount of context. o3 seems like a regression from the GPT-4 series in terms of hallucinations (Gemini 06-05 is nearly perfect), and it just makes up random conclusions from cases when its context window is filled with even 20,000 or 30,000 tokens.

Deep Research

Deep Research is the most valuable tool available today. We have become confident enough in OpenAI's Deep Research reports that in most situations we would prefer a Deep Research report over an opinion letter from a human lawyer.

When writing legal documents, I run as many as ten Deep Research reports in a day, sometimes concurrently. Examples of topics that we requested Deep Research on include:

  • Missing arguments in each of the 7 sections of the brief
  • An analysis of peripheral cases, such as the class action and the Genesis Wind Down Estate's action
  • Detailed analyses of key areas of law, such as equitable estoppel
  • The possibility of unintended consequences of the Connecticut action
  • Service requirements in Connecticut (Deep Research found the critical rule that, in Connecticut, the clock stops on service, not filing)
  • Case law related to arguing in the alternative
  • Locating the residential addresses of the defendants to serve them
  • Finding assets of the defendants
  • An analysis of causes of action in our next suit against Zac Prince of BlockFi: https://prohashing.com/blog/crypto-lending-untruths-part-1-blockfi

Deep Research is able to find amazing things - likely from public records that are not readily available on Google. It found, for example, that while I pay $750/month in rent for a small suburban house, Defendant Moro lives in a penthouse in a 30+ story building in Brooklyn, with $3 million in equity and a $3 million loan.

Additionally, after the defendants returned the forms rejecting the magistrate judge's authority over the case, we ran multiple Deep Research reports to determine what cases are on the District Judge's docket. We found that many cases in the system are prisoner appeals and social security disability benefit issues; cases like this one appear to be somewhat uncommon.

When we started out with PROHASHING, we were delayed for 6 months because we could not find an attorney willing to write an opinion letter on how we needed to structure the business, and we paid $3,000 to the attorney who finally wrote one. Today, we would have run Deep Research and had the information we needed within 20 minutes, for $20 (if we only had the Plus plan). In one way, we aren't disappointed, however, as the estimated $1 million we would have made during that time would just have ended up being stolen by the Defendants five years later anyway.

Learning As You Go

One critical limitation is that we can't become experts in every area of the process at once. Having worked at this for 60-80 hours a week for 6 months, we are essentially students entering the second year of law school.

In 2024, we examined each phase of the litigation to become familiar with how everything works. But right now, we are experts in the dismissal phase and idiots at bench trials, and are still learning how the case will continue in the bankruptcy court once DCG declares.

On Monday, we will begin performing Deep Research into how discovery works in federal cases. At least for this case, we'll always be learning as we go, trying to stay ahead of the next phase. Our goal is to use this knowledge as we later transition our attention towards the fraudsters involved with cryptocurrency firms unrelated to Genesis.

Barriers Against Self-Represented Litigants

More serious than not knowing how to proceed, however, are the major barriers that have been put up to perpetuate the high salaries of lawyers and ensure the legal system works well for those with money. Of course, this situation isn't unique to law - the same is true in medicine, where for 19 years a psychiatrist has been paid an estimated $23,180 so that I can repeatedly say that I'm doing perfectly well and get the same drugs prescribed, over and over. These (and other) industries close off the system in the name of "protection" or "safety," but the barriers really exist primarily to enrich those within the system.

In the District of Connecticut, I am not impressed with anyone except the judge.

We filed our case on May 29. If we were attorneys, we could log into the PACER system, create a new case, upload the complaint and paperwork, request summonses, and get them to the Connecticut Marshals the same day. Attorneys can do all of this for free, except for the Marshal service and the filing fee.

Since we are not attorneys, we have to do everything by mail. It cost more than $100 in overnight mail labels to send 500 pages of paperwork to the court, and the court didn't post it to the docket until four days later. We then had to file motions for electronic access, which have not yet been ruled upon. Until they are, we have to send letters back and forth to get the paperwork required for service. All of this is for a case that we hope is stayed or dismissed with leave to refile.

Had the defendants not waived service, we intended to drive 600 miles to Connecticut and back on Friday, June 6, 2025, shuttling between the courthouse and the Marshals with a printer and two computer battery backups in the trunk of a Prius - the paper shuffle that only pro se litigants are required to deal with. But the service waivers are still not posted to the docket even though they were sent days ago.

At one point, Chris had to hang up on whoever had answered the phone at the court, because she was simply unable to help. Chris reiterated the time criticality to the clerk, but the clerk refused to help, stating that the electronic filing motion had to be approved by a judge and that a "PACER training course" was required, despite his already having filed tens of documents through PACER.

As a result, if this case ever gets to Connecticut (and we fervently hope it will not), these pro se requirements might force us to write lengthy briefs arguing for equitable tolling, and to appeal that decision if it is denied. This is ironic, because the sole purpose of the case is to ensure our rights aren't lost to the statute of repose. What a waste of taxpayer money, paying judges to read arguments that could be avoided by simply allowing people to register and post their cases! You should care, because you may end up paying for this.

What We Don't Know We Don't Know

But back to Pennsylvania, the real focus, where everyone wants the case decided on the merits. I'll conclude with what I believe is the most likely way we could fail in this case: by not knowing that we don't know something.

Over many years, I've been surprised by how often I become concerned about specific issues in an area of work and then work hard to make sure they don't become problems. Almost always, those issues are solved - but the focus on them leads me to overlook something simple.

So, my main concern is not that the arguments in the brief are weak. With perfect execution, we will win the case. The issue is that we can't achieve perfect execution because we are not attorneys, and there is guaranteed to be something far outside the norm that we have overlooked. For example, we may unintentionally violate a rule we don't know about in a document, causing a loss - or worse yet, sanctions that bankrupt us. We might do something that, while permissible, is not often done by attorneys, causing human-factors issues where the judge looks upon us negatively. We aren't aware of how auxiliary issues that affect cases work, like D&O insurance, and the interplay of these outside issues with what is happening on the docket could cause a loss. I like to categorize these as "rich man's issues" - the type of things that people in the upper class deal with but normal people don't have a clue about.

Why This Case Is Important

The battle continues. Whereas before we would have considered an easy out, the defendants' behavior - specifically the criminal allegations they have now made against us within their briefs - has made it absolutely certain that this case will end in summary judgment, trial, or bankruptcy court. Their communications to us demonstrate that they are not professional people who can be reasoned with for settlement. And, quite frankly, someone who claims that you are a sanctionable criminal fraudster can't be trusted to honor any settlement. All of the information needs to come out in discovery for the world to see, and a public bench trial will show everyone the harm these people have done.

The defendants may settle all the other cases against them, so that they can retain a few million dollars and not end up in bankruptcy. The large law firms in those cases represent huge numbers of people, and they have a duty to get something for their clients, not everything.

It is likely that it will fall to us to ensure they end up living like us. Nobody is going to be residing in $6 million penthouses after ruining a million people's lives while we still breathe. The case started out about money. Then the defendants accused us of fraud. Now it's about ensuring they don't do any of this to anyone else ever again. If we are still using an army of AI agents to seize their cars and levy their bank accounts in 2040, then so be it.

That's why this case may be the most critical of all of them - it could be the last one standing, and the only way to finally put an end to the defendants' ability to harm the world with the power their money brings. Our mission is to show these people what it means to have to get a job and work for a living, reminded with every paycheck as the wage garnishment goes to make things right.

Conclusion: Discovering Discovery

We are spending the next few weeks "discovering discovery": researching how the process works and preparing a plan covering issues like what evidence we need and whom to depose, so we aren't blindsided when the judge starts ordering conferences.

I encourage everyone to read the response brief and bring themselves up to date on the case. Most of the secondhand commentary is inaccurate because it misunderstands the evidence. We'll post another update once the R&R is published, if we decide to proceed against the other fraudsters now, or if one of the related Genesis actions has a ruling. In the meantime, you can listen to "Atlas In The Junkyard" (https://stevesokolowski.com/songs/atlas-in-the-junkyard/), which was inspired directly by the Defendants. Until next time, we'll continue "lifting shards of yesterday toward a humbler sky!"

