AI pioneer Sandra Carrico joins Ledge in this fascinating episode, where she recounts the early days of AI, the AI winter of the ’80s, and the dawn of computational linguistics.
Now the VP of Engineering and Chief Data Scientist at Glynt.ai, she works daily with the company’s ML systems to refine the patented algorithms she invented to extract data from documents such as utility bills, insurance claims and medical records with better-than-human accuracy.
Sandra delivers fascinating insights into how she turns problems into solutions with a mix of mathematical know-how, coding experience and management skills.
Sandra holds the dual roles of Vice President of Engineering and Chief Data Scientist for GLYNT. As Chief Data Scientist, Sandra invented the GLYNT system for data extraction from documents. Her advanced machine learning approach has been presented at technical conferences and her writing can be found on arXiv and Medium. As Vice President of Engineering, Sandra led the development of the GLYNT product, including the supporting Elastic AI Workbench, which pipelines and orchestrates models, data, documents and workflow. Previously Sandra was VP of Engineering at a number of startups. She has also been in engineering management at AT&T Bell Labs, Aurigin and AT&T Labs.
Ledge: Sandra, it’s great to have you on. Thank you for joining the show.
Sandra: Thank you. I’m really excited to be here. It’s going to be a lot of fun.
Ledge: If you don’t mind, would you give a two- or three-minute introduction of yourself and your work? I want the audience to get to know you a little bit.
Sandra: Sure. I’m the VP of Engineering and Chief Data Scientist at GLYNT.AI.
GLYNT liberates data that’s trapped in documents. We also respect the corporate data governance requirements that a lot of enterprises have today. What I like to call it is, your data, your models.
I developed a novel machine learning algorithm. Using that, we get about 98% accuracy and we only require around seven examples to train. This system is offered via SaaS.
This whole business started as a problem, where we wanted to free programmers from writing code to extract data from utility bills. We knew that this thing wasn’t going to scale very well and very cost-effectively, and so I started to bang my head against the wall to figure out how we could more effectively and more inexpensively extract this data from utility bills. What ended up happening is I accidentally stumbled onto a solution that would extract data from all kinds of invoices or lab care records.
That’s where we are today. It’s fascinating how big an idea this turned into.
Ledge: You, off mike, talked about – I think this is fascinating – that you were involved in AI going quite a ways back. That you saw the rise and fall, or so you called it, the AI winter of the ’80s.
Maybe tell us some of that story and how that informed your thinking pattern.
We have a lot of people now interested in, I guess, AI 2.0. Everybody wants to get involved with this now. We have speakers on our calendar, and our fridge can automatically order stuff or whatever. Knowing some of that history maybe and that thinking pattern that informed your process to solve big problems, I think that’s just super interesting.
Sandra: Yeah. I took a bunch of classes in grad school. I was at Bell Labs and they had a program there where they’d send you away to school for a year and they paid everything. It’s like I got new parents. My new supervisor, or dad, would call me every week and ask what I needed, and he’d FedEx anything I wanted. I could take any classes.
I took a lot of math, a lot of statistics and I also took AI because it looked interesting. In one of the classes, I worked on genetic algorithms. Since I had my own computers, because I was really, really well supplied from Bell Labs, I was able to just run genetic algorithms. Because I had the math background I could see that it should work, but I also could see that it wasn’t converging fast enough. It was clear that the machines of roughly 1986 were not sufficiently powerful to allow this to converge.
At the same time, I was in a seminar. We were looking at neural networks and how those all worked. People were starting to get this inkling that maybe those wouldn’t converge either.
Ledge: When you say converge, just for anybody who doesn’t know, what is the convergence or what is that concept that you’re talking about?
Sandra: Basically, what you want to know is that the thing is going to produce a model that’s going to produce a good answer all the time. What was happening was that I was training these models but they were just producing garbage. The answers weren’t getting progressively better.
Ledge: That’s just the nature of raw compute power? That you just couldn’t do enough, or it wasn’t powerful enough, to get there?
Sandra: That’s exactly it. There just weren’t enough cycles. I ran my genetic algorithms experiment for three weeks straight and it wouldn’t… It got a little better, but it was clear to me I was going to have to run this thing for years.
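To make the idea of convergence concrete, here is a minimal sketch of a genetic algorithm that records its best score in every generation. It is not the experiment described above: the toy objective (`one_max`, which counts 1 bits), the function `run_ga`, and every parameter value are assumptions chosen only for illustration. Watching whether the recorded scores keep climbing toward the maximum is one simple way to judge whether a run is converging.

```python
import random


def one_max(bits):
    """Toy fitness function (an assumption for illustration): count the 1 bits."""
    return sum(bits)


def run_ga(n_bits=32, pop_size=20, generations=40, mutation_rate=0.02, seed=0):
    """Run a tiny genetic algorithm; return the best fitness of each generation."""
    rng = random.Random(seed)  # seeded so repeated runs give the same history
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=one_max, reverse=True)
        history.append(one_max(population[0]))
        parents = population[: pop_size // 2]  # selection: keep the fitter half
        children = [population[0][:]]          # elitism: best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)     # single-point crossover
            child = a[:cut] + b[cut:]
            # mutation: flip each bit with a small probability
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    return history


history = run_ga()
print(history)  # best score per generation; it never decreases because of elitism
```

On modern hardware this toy run finishes almost instantly. A harder objective, a larger search space, or a slower machine stretches the same loop out, which is the situation described above: the scores creep upward, but far too slowly to be useful.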
Ledge: Right. That was not the world of, what, now we have like… or something crazy like that?
Sandra: Exactly, so we couldn’t converge. Then strangely, as an undergrad I was at Northwestern. At Northwestern, they had a lot of people doing computational linguistics. I was involved in a lot of that.
I learned a lot about how to break down sentences and how to do grammars. That, of course, fed into a lot of my compiler work, because we did languages for that.
Ledge: Is that going to be what you call an…?
Sandra: It wasn’t directly related, but let’s just say it didn’t hurt that I knew how to look at English. I did a lot of stuff around linguistics in that area as well.
Ledge: Is that the science that we call…?
Sandra: You still there, Ledge?
Ledge: I am. Can you hear me?
Sandra: Looks like I lost you. Okay. I see it’s at zero. Looks like we’re recording.
Ledge: Okay, so computational linguistics. Go.
Sandra: I was in undergrad, and it happened that at Northwestern there was a lot of work around computational linguistics. What I did was I learned a lot about how to break down sentences and find nouns and so on. That was very helpful because that was the direction that a lot of AI was going in at the time. But then there was the winter and so we didn’t do much more of that.
Ledge: Is that what became…? Now we hear about NLP a lot. Is that…?
Sandra: That’s right. That was some of the early work in NLP. We had isatries and parses. That’s right.
Ledge: All right. So all that starts to converge but there’s this winter. Did you stop working on it? Maybe you should explain what the AI winter was, because I don’t want everybody to think all Terminator here.
Sandra: Yeah. What happened was, I can’t remember the professor’s name, but somebody declared that AI would never work, and everybody else decided to believe that person, and so AI work stopped. I don’t remember exactly what year it was, but let’s call it ’88, ’89, ’90, somewhere in there.
Ledge: Was this general AI will never work, or narrow, or was there any conception there?
Sandra: I don’t recall the exact statement because I wasn’t really there for that part. It was just everybody agreed to stop looking at AI and machine learning right about then.
Ledge: Okay. Did you go do something else?
Sandra: Well, actually, what happened was I went to grad school and Bell Labs paid for me and then they brought me back. They actually had me building routers. Think Cisco routers, except it was Bell Labs routers.
I went on to learn a great deal about software engineering and how to run development organizations and research organizations and do technology transfer, do good research. I developed a very deep general background in software engineering.
In fact, I was there for some of the earliest work in agile programming. I was figuring that out before agile existed. My friends were teaching me extreme programming before extreme programming existed. All that stuff was being developed. I had this great foundation for when this opportunity presented itself in the not so distant past. It’s been several years, but yeah.
Ledge: How did all that…? It’s a different kind of convergence. So much has changed over that time. I can completely imagine, just from my own journey, that the fundamentals build up over time. You’re prepared, because of all that work, to solve a future problem. Your own brain’s pattern matching and all that, it just comes from experience.
The technology, like the stacks and all the stuff, do you have in the back of your head, “Oh, good. Now we have enough compute power to do this? We can do things now?”
Sandra: Yeah, exactly. In fact, the way I was taught to do computers, which is not the way it seems they’re teaching people now in computer science, is that I was taught to look at it far more abstractly. I don’t look at languages. I look at language capability and what they allow easily and what they don’t allow easily, basically at the language level.
Ledge: Kind of like right tool for the job, kind of idea.
Sandra: Exactly, right. To me, the changes that we’ve undergone are simply differences in emphasis that people have chosen over time and what kinds of things they valued.
We valued different things 20 years ago than what we value now. Twenty years ago we cared about how much memory we used, and we wanted to optimize CPU performance above programmer performance. Now, we care much more about programmer performance so we have much more powerful statements.
Things like Python. Python wasn’t really possible a long time ago. It just would have been too inefficient from a compute point of view. It would have been a nice prototyping tool.
Ledge: Right. We can continue to add layers of abstraction because we have the ability to continue to abstract with greater technological capabilities and compute power.
Sandra: That’s right. We didn’t have good search back in the day. Now, we have very, very powerful search. Google just couldn’t exist because there wasn’t enough compute and the reliability of the devices wasn’t enough. Even now, it can be very stressful for Google to keep running, because disks fail at a pretty high rate when you start to multiply those. They’re constantly having to create infrastructure which is resilient to those failures.
We had to build a lot of infrastructure in order to get to the point we are now. Stack Overflow didn’t exist and now it’s fantastic. It’s so much more productive with Stack Overflow.
Ledge: Right. The idea that you can just multiply your access to answers that you don’t… I think you’re right. It harkens back to that idea that there is nothing new – there are just new implementations of the same.
What you just said is very anthropological. Looking back through, it’s just different views on the same thing, with a little more icing on top every couple of years.
Sandra: Yeah. There’s more icing and more capabilities. Like what you’re saying with abstraction is exactly the right idea. What we’re doing is we’re encapsulating more and more capability and we’re providing just single interface points to those. As we produce more and more capability, we’re able to build things up.
I remember when I hired a guy to GLYNT, and when he came on he was shocked at the rate of productivity we had. What he was seeing was that we were producing, in terms of functionality, as much as a 100-person team 10 years ago would have been able to produce. We’re doing it with four or five people. It’s just because so much stuff had been encapsulated, we were able to start deploying things at big chunk levels and the interfaces work.
Ledge: That’s interesting because it gets to the organizational abstraction. One thing I’ve noticed is that, if you dig into architectural patterns and just design paradigms, the organization starts to resemble the code and/or vice versa, in the same way that the dog sort of resembles the owner and vice versa.
We build organizational constructs because we needed to break code up, and then we call that microservices, so that we could get more stuff done. But the pendulum swings the other way too. Then you start to re-conglomerize your stuff back into larger chunks.
I imagine you’ve seen that in practice over and over again.
Sandra: Well, actually, microservices are interesting because we used to have those a long, long time ago, and we called them subsystems. We said that they had defined interfaces and everything else was encapsulated.
I think people now who are producing these microservices haven’t seen them before, but for myself it wasn’t a big deal, because I looked at it and I went, “Microservice.”
We started getting bigger and bigger problems in GLYNT. GLYNT is a huge system. It’s very complex. I said, “When I used to design really large systems, I would have made this a subsystem and this a subsystem.” They’re like, “Oh, well…” We had microservices but I hadn’t completely appreciated them. They’re like, “Well, we’ll just use our microservice.” I’m like, “That’s perfect.” “We’ll put an API on it.” I’m like, “That’s exactly what I want.”
We didn’t design the engineering to match the microservices, because we didn’t need to. We stepped it forward, because we had experience, at least at the managerial level, with this kind of paradigm. I didn’t feel compelled to… It wasn’t such a hard leap for me to understand what was happening. We actually didn’t structure the assignments according to microservices.
Ledge: You could probably get away with that too, because of the excellent design choices. You didn’t have 50 engineers, you had five. You didn’t have a team for every service.
Sandra: Exactly. I didn’t need to, because I’m able to… I think part of the reason that organizations may be building up these giant structures around microservices and isolating people – which isn’t a bad breakdown depending – but if you’ve seen this before, it’s not so hard to climb that curve. You go, “Okay, I’ve done this before. There are other ways I can do assignments besides, okay, you’re siloed here, you’re siloed there.”
Ledge: That’s a good point. Now, you have this learning and it’s super valuable. A lot of people who are getting into the field right now have simply never done it before.
What are those best practices? How did you make choices that were good and prevent yourself from falling into the same hole that you knew you could fall into because you did it 20 years ago?
Sandra: Remember, I was very fortunate because I came after a long stream of people had been designing software for 20 years before me. When I started, I was being indoctrinated into how to build things, so I avoided a whole ton of problems.
Ledge: Now, you’re the giant whose shoulders everybody else has to stand on, so please help out here.
Sandra: That’s right. It’s the same thing that we learned in school. What you want to do is have isolation of functionality. You want to have defined interfaces. You want things to not break the paradigm, that’s…
Ledge: Respect the black box.
Sandra: Exactly. That hasn’t changed. It hasn’t changed along…
Ledge: Is it not done rigorously enough? You hear disaster stories all the time. As software gets more and more complex, it gets more and more rife with just absolute collapses.
Sandra: Well, again, that’s been going on for a very long time. It will collapse if you don’t respect the abstractions. I think that a good respect for abstraction is really critical for a successful software solution.
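As a rough illustration of “isolation of functionality” behind a “defined interface,” here is a minimal Python sketch. Every name in it (`TextExtractor`, `KeyValueExtractor`, `CsvExtractor`, `total_due`) is hypothetical and invented for this example; it does not describe how GLYNT or any system mentioned in the conversation is built. The point is only that the calling code touches the interface and nothing else, so the internals behind it can be swapped without the caller changing.

```python
from typing import Dict, Protocol


class TextExtractor(Protocol):
    """The defined interface: callers may rely on this method and nothing else."""

    def extract(self, document: str) -> Dict[str, str]:
        ...


class KeyValueExtractor:
    """One hypothetical subsystem: parses 'key: value' lines."""

    def extract(self, document: str) -> Dict[str, str]:
        fields = {}
        for line in document.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip().lower()] = value.strip()
        return fields


class CsvExtractor:
    """A second hypothetical subsystem: different internals, same interface."""

    def extract(self, document: str) -> Dict[str, str]:
        header, row = document.splitlines()[:2]
        return {
            key.strip().lower(): value.strip()
            for key, value in zip(header.split(","), row.split(","))
        }


def total_due(extractor: TextExtractor, document: str) -> str:
    """Caller code: depends only on the interface, never on an extractor's internals."""
    return extractor.extract(document)["total"]


print(total_due(KeyValueExtractor(), "Account: 42\nTotal: 17.50"))
print(total_due(CsvExtractor(), "account,total\n42,17.50"))
```

If `total_due` reached into the private parsing details of one extractor, swapping in the other would break it. That is a small-scale version of the collapse that comes from not respecting the abstraction.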
Ledge: That’s a perfect tweet right there. We’re going to use that. Is that the most critical thing, respect for abstraction? Can you point to that and say, hey, that’s the key?
Sandra: Well, that certainly helps.
The other thing is not reinventing the wheel. Which is very cliché, but we use a lot of software that already exists. Those companies that are deploying software for backup and restore, business continuity, continuous integration, all those items; they have – out of necessity, because of their cooperation – they have a defined interface. They, hopefully, have fixed all the internals so that it actually works.
You use these piece parts at very high levels of granularity, so you get: kaboom, an OCR engine drops in; kaboom, an API structure drops in. That allows you to get a huge amount of leverage.
Ledge: Right. I think of those as macroservices. It’s really the same thing. You’re just completely abstracting a major component of your system, upstream or downstream, to another provider.
Sandra: That’s right.
Ledge: Of course, that also comes with the fact that now we’re all hosted on Amazon. God forbid anything happens to Amazon and it all goes down. There seems to be an acceptable level of upstream and downstream risk that goes into that design calculation.
Sandra: That’s right. What I try to do is outsource all of the functionality that some other company provides. I just want to use their stuff. The only thing I want to make are the few things that provide value to my customers. That make me distinguished.
Even doing that and being very religious about it, we still have a huge, huge amount of software to build in machine learning. It’s become enterprise software again, in terms of the amount of infrastructure that you have to have in order to field something that a corporation can use.
Ledge: Right. That’s a whole… Geez, we can go on for hours about that.
Let me ask, as we’re running out of time here. Think about, if you would, what, to your mind… I say this:
We’re in the business of evaluating and vetting and certifying excellent engineers. We do that well and there seems to be a very high success rate, so we’re proud of that. Yet, I’m never too arrogant to say, okay, every guest, I want to know what are your heuristics to measure just an A+ engineer that you want to work with on your software.
Sandra: This shouldn’t be surprising. I look for their ability to abstract. I ask them questions about how they solve problems. I try to find out how well they thought about things from an abstract point of view.
Now, that doesn’t mean I don’t hire people who can’t abstract, because there are some people who just know the tools really well. DevOps, for example, not a high abstraction job. That’s a, know all the stuff, be good with Stack Overflow and understand…
Ledge: Toolchains.
Sandra: Yeah, exactly. Understand how things fit together. Know where on the abstraction level that job is, and then figure out how fast that person can learn.
Do you have a problem where you can just hire somebody who already knows it, where the entire problem is well defined? In that case, great. Just hire the person with the key buzzwords who knows that. Don’t worry about learning. Don’t worry about abstraction. Just say, “Here’s your job. Do it.”
Most jobs aren’t that way and most jobs don’t stay that way.
Ledge: Understood. Excellent. Before we wrap, I have my fun lightning round. Are you ready?
Ledge: Star Wars or Star Trek?
Sandra: Star Wars.
Ledge: Good, good. We’re on the same team there. What are you reading right now?
Sandra: I can’t remember his name. I’m sorry.
Ledge: All right. We’ll have to look that up. What can you absolutely not live without?
Ledge: What is the last thing that you Googled for work?
Sandra: Aurora on AWS.
Ledge: Nice. Okay. I don’t know if you’re a fan of The Office, but there’s a classic episode of The Office where Jim is messing with Dwight, who’s the office heel, and he’s sending him faxes from future Dwight. He’s saying, “The coffee is going to poison you,” and things like that, and Dwight has…
This got me thinking, what if I gave you a piece of paper, one piece of paper and a Sharpie, what would you fax to yourself 10 years ago?
Sandra: A piece of paper and a Sharpie, 10 years ago, so that’s 2008. Buy the dip.
Ledge: Buy the dip. Well played. We’ll include a stock chart for anybody that doesn’t get it. Excellent.
Well Sandra, this was fun. Thank you so much. I love the insights, love the learning. We’ll have to come back and do even some more stuff on enterprise next time.
Sandra: That sounds great. Well, thanks. Thank you for having me.