Can I extract specific text from the middle of a string?

  • 1
  • 1
  • Question
  • Updated 2 years ago
  • Answered

I have a field that contains serial numbers within blocks of sentence text. I want to extract the serial numbers and place them into their own field. I cannot control the input into these fields or mandate that a new field be filled out.

For example, the field may contain: 

"Picked up S5442 at Boxies on Golden" or "S5931 inbound to home"

What I'd like my formula field to display is "S5442" and "S5931" respectively.

The serial numbers will always start with "S5" and be 5 characters long total.

What would be ideal is something similar to excel's 'Find' formulas - for example: MID(A1, FIND("S5", A1,), 5) - However I can't seem to find anything analogous to this in Quickbase.

Photo of Bob

Bob

  • 0 Points

Posted 3 years ago

  • 1
  • 1
Hi Bob,

Let's say your field was called "Description".

If you use this formula:

If(Contains([Description],"S5"),
"S5"&Left(Right([Description],"S5"),3),"")


It will find the first instance of "S5" in your field then combine S5 with the first 3 characters after S5 in the string of text. It is sort of a quick and dirty solution, but I think it should probably work. If S5 is not detected, the formula text field will remain blank.

Let me know if that works for you. Good Luck!

Eric
Photo of Bob

Bob

  • 0 Points
Hey Eric,

Thanks for the suggestion. Clever use of Right and Left together like that - I was stuck on using Mid.

The formula you provided works sometimes, but not all the time. It looks like the formula starts backwards at the end of the string and then pulls text whenever it comes across /either/ a "S" or a "5" rather than the two together. For example, here's part of a sentence I have:

"S5775: To return blah blah blah blah
Should use 0, 2, 5, or 8"

The formula outputs: "R4, o" so it seems the whole string "S5" in the left/right part of the formula isn't being respected and it is simply pulling everything after it finds "5". The same thing occurs with a sentence that contains a capital "S".
Photo of Bob

Bob

  • 0 Points
Looks like this is due to the way Left and Right work - per the documentation:
"The second argument, d, is a text value containing all the possible delimiter characters."

So it seems each character put in the formula acts as a delimiter, rather than the whole string. Is it possible for a string to act as a delimiter rather than the individual characters?
Photo of QuickBaseCoach App Dev./Training

QuickBaseCoach App Dev./Training, Champion

  • 67,780 Points 50k badge 2x thumb
This should work. Admittedly some brute force involved here, but the formula does have some repetitive elegance to  it at least.

var text SOne   = [Pickup Description];

var text STwo   = NotLeft($SOne,"S");

var text SThree = NotLeft($STwo,"S");

var text SFour   = NotLeft($SThree,"S");

var text SFive    = NotLeft($SFour,"S");

var text SSix     = NotLeft($SFive,"S");

var text SSeven= NotLeft($SSix,"S");

var text SEight   = NotLeft($SSeven,"S");

var text SNine= NotLeft($SEight,"S");

var text STen   = NotLeft($SNine,"S");



If(

Begins($SOne,"5"), "S" & Left($SOne,4),

Begins($STwo,"5"), "S" & Left($STwo,4),

Begins($SThree,"5"), "S" & Left($SThree,4),

Begins($SFour,"5"), "S" & Left($SFour,4),

Begins($SFive,"5"), "S" & Left($SFive,4),

Begins($SSix,"5"), "S" & Left($SSix,4),

Begins($SSeven,"5"), "S" & Left($SSeven,4),

Begins($SEight,"5"), "S" & Left($SEight,4),

Begins($SNine,"5"), "S" & Left($SNine,4),

Begins($STen,"5"), "S" & Left($STen,4))
Photo of Ⲇanom the ultimate (Dan Diebolt)

Ⲇanom the ultimate (Dan Diebolt), Champion

  • 30,074 Points 20k badge 2x thumb
>The formula you provided works sometimes, but not all the time. 

I have made this same mistake thinking that Right() and Left() will parse on multi-character strings rather than any one of the delimiter characters. This is a common mistake to programmers because most languages support this trivial text parsing feature as well as more advanced string manipulation methods and regular expressions.

I hesitate to provide this answer because this is trivial to do in JavaScript and the native QuickBase solution is a hack that is both cumbersome, promotes bad practices and has an upper limit to the number of tokens it could parse out.

But here is my solution that will attempt to parse the first seven S5* tokens:

List("\n",
  If(Begins(Part([f], 2, "S"), "5"), "S" & Left(Part([f], 2, "S"), 4)),
  If(Begins(Part([f], 3, "S"), "5"), "S" & Left(Part([f], 3, "S"), 4)),
  If(Begins(Part([f], 4, "S"), "5"), "S" & Left(Part([f], 4, "S"), 4)),
  If(Begins(Part([f], 5, "S"), "5"), "S" & Left(Part([f], 5, "S"), 4)),
  If(Begins(Part([f], 6, "S"), "5"), "S" & Left(Part([f], 6, "S"), 4)),
  If(Begins(Part([f], 7, "S"), "5"), "S" & Left(Part([f], 7, "S"), 4))
)

You can increase the number of tokens it will attempt to parse by adding additional lines using the obvious pattern of statements. You can use this text as a test case to see what I mean:

I HAVE SASSAFRAS AND BOXES OF S5442, S5443, S5444, S5445, S5446, S5447, S5448, S5449, S5450, S5451, S5452


FWIW, if you want a JavaScript solution you can adopt what is presented in this Q&A:

https://quickbase-community.intuit.com/questions/1193611-how-to-globally-replace-string-in-text-fiel...
Photo of QuickBaseCoach App Dev./Training

QuickBaseCoach App Dev./Training, Champion

  • 67,780 Points 50k badge 2x thumb
Dan's suggestion is more elegant than mine in that it is shorter and also it will parse out multiple Part numbers, where mine will stop at the first Part # finds.
Photo of Ⲇanom the ultimate (Dan Diebolt)

Ⲇanom the ultimate (Dan Diebolt), Champion

  • 30,074 Points 20k badge 2x thumb
Thanks for the plug but the problem with this solution is that it is so inflexible. The first thing that is going to happen is a user is going to enter s5442 instead of S5442 and the formula will have to be patched.
Photo of QuickBaseCoach App Dev./Training

QuickBaseCoach App Dev./Training, Champion

  • 67,780 Points 50k badge 2x thumb
well, then this would work

List("\n",
  If(Begins(Part([f], 2, "S"), "5"), "S" & Left(Part([f], 2, "Ss"), 4)),
  If(Begins(Part([f], 3, "S"), "5"), "S" & Left(Part([f], 3, "Ss"), 4)),
  If(Begins(Part([f], 4, "S"), "5"), "S" & Left(Part([f], 4, "Ss"), 4)),
  If(Begins(Part([f], 5, "S"), "5"), "S" & Left(Part([f], 5, "Ss"), 4)),
  If(Begins(Part([f], 6, "S"), "5"), "S" & Left(Part([f], 6, "Ss"), 4)),
  If(Begins(Part([f], 7, "S"), "5"), "S" & Left(Part([f], 7, "Ss"), 4))
Photo of Ⲇanom the ultimate (Dan Diebolt)

Ⲇanom the ultimate (Dan Diebolt), Champion

  • 30,074 Points 20k badge 2x thumb
No it won't as you didn't include the lower case "s" among the delimiters of the Part() calls within the Begins(). But the bigger issue is that you will always be debugging and wrestling with the limitations of the formula language. In JavaScript you can parse any pattern whatsoever using regular expressions and compute any output you desire - effortlessly. One short regular expression /S5\d{3}/g will parse out all the S5* tokens:

var f = "I HAVE SASSAFRAS AND BOXES OF S5442, S5443, S5444, S5445, S5446, S5447, S5448, S5449, S5450, S5451, S5452";

f.match(/S5\d{3}/g).join("\n");

result:

"S5442
S5443
S5444
S5445
S5446
S5447
S5448
S5449
S5450
S5451
S5452"
Photo of QuickBaseCoach App Dev./Training

QuickBaseCoach App Dev./Training, Champion

  • 67,780 Points 50k badge 2x thumb
yes, quite right so this is now the best native solution

List("\n",
  If(Begins(Part([f], 2, "Ss"), "5"), "S" & Left(Part([f], 2, "Ss"), 4)),
  If(Begins(Part([f], 3, "Ss"), "5"), "S" & Left(Part([f], 3, "Ss"), 4)),
  If(Begins(Part([f], 4, "Ss"), "5"), "S" & Left(Part([f], 4, "Ss"), 4)),
  If(Begins(Part([f], 5, "Ss"), "5"), "S" & Left(Part([f], 5, "Ss"), 4)),
  If(Begins(Part([f], 6, "Ss"), "5"), "S" & Left(Part([f], 6, "Ss"), 4)),
  If(Begins(Part([f], 7, "Ss"), "5"), "S" & Left(Part([f], 7, "Ss"), 4))

I do agree that the one line of code does look powerful
Photo of Ⲇanom the ultimate (Dan Diebolt)

Ⲇanom the ultimate (Dan Diebolt), Champion

  • 30,074 Points 20k badge 2x thumb
But the "best" known QuickBsae formula fails on this test case:

I am driving down US54 to pick up a load of SASSAFRAS AND BOXES OF S5442

You simply cannot get the anywhere near the level of precision with the extremely limited set of text manipulation functions in QuickBase compared to using script and regular expressions.
Photo of Bob

Bob

  • 0 Points
Thanks a bunch guys - I realize it's not the most elegant solution when compared to script/code, but this is pretty much what I was looking for and it works.
Photo of Ⲇanom the ultimate (Dan Diebolt)

Ⲇanom the ultimate (Dan Diebolt), Champion

  • 30,074 Points 20k badge 2x thumb
B> ...  it's not the most elegant solution ...

Southbound US54 closed due to overturned truck spilling SASSAFRAS AND BOXES OF S5442, S5443, S5444, S5445, S5446, S5447, S5448, S5449, S5450, S5451, S5452 - Northbound US54 carrying <script> traffic unaffected.
Photo of Bob Jordan

Bob Jordan

  • 274 Points 250 badge 2x thumb
Hey Dan,
You simply cannot get the anywhere near the level of precision with the extremely limited set of text manipulation functions in QuickBase compared to using script and regular expressions

You mentioned that this could be done with script and regular expressions - can this be done within a quickbase formula field?

I've noticed that within a text formula field I can include "javascript($string..." and it will accept it, but any parameters of the javascript variable willl error out (ex: javascript($string.search("S5")) will result in an error at the .search part).
Photo of Ⲇanom the ultimate (Dan Diebolt)

Ⲇanom the ultimate (Dan Diebolt), Champion

  • 30,074 Points 20k badge 2x thumb
It isn't worth the trouble trying to put JavaScript into your formulas as you will have a difficult time debugging your formulas and adding a lot of character escaping which makes the formulas unreadable. It is best to get out of the formula language as quickly as possible and pass the least amount of field values to the script. When you try to do parsing like this exclusively in the formula language you have to limit yourself to some upper bound of instances you can parse because the formula language has no looping or iterative methods and this process makes for jumbo formulas. In script you have no such limitations or nuisances.

Anyways I have a demo in preparation for a similar question which I should finish shortly. I will reread your question and reply further when I get a chance.
(Edited)
Photo of Bob Jordan

Bob Jordan

  • 274 Points 250 badge 2x thumb
Hi,

Thanks Dan. Understood about the script in formulas - it does seem rather cumbersome. If I wanted to look into getting out of formulas though, where woudl you input the script? 

I work with imported sheets, so I'm attempting to pull the middle strings out from these imported fields. Is there a specific type of field and format I should use to make use of script?
Photo of Bob Jordan

Bob Jordan

  • 274 Points 250 badge 2x thumb
Thanks Dan - very helpful. I now have the IOL working with the alert popping up on my forms.

It looks like in order to make the .js target a specific field, I'd need to specify the field ID I'm trying to target, correct? I'm going off the example you provided here: https://community.quickbase.com/quickbase/topics/what-is-the-simplest-iol-example

So to get the middle string using this method, I'd do something like: 
$("#_fid_[field ID of field]").substring(1,4);

This code by itself doesn't seem to work though. 

I assume that I would need a separate module.js page and new IOL field for each custom js I'd like to include in the form, correct? For example, having one formula that finds a middle string and another that throws an alert would require two .js pages and two IOL fields on the form.