I am working in a Jupyter Notebook with PySpark v2.3.4, which runs on Java 8, Python 3.6 (with py4j==0.10.7), and Scala 2.11. I have a Scala case class that takes a scala.util.matching.Regex (scala doc) as an argument, like so:
case class myClass(myString: String, myRegex: Regex)
I would like to construct an object from myClass, but I can't figure out how to construct a scala.util.matching.Regex object in a Python / PySpark environment. Below are my attempts so far (based on the docs I've followed) to create a Scala regex, where sc is my SparkContext.
- sc._jvm.scala.util.matching.Regex("""(S|s)cala""")
  - Error: Constructor scala.util.matching.Regex([class java.lang.String]) does not exist
  - This error message dumbfounds me because the Scala 2.11 docs clearly state that its constructor takes in a java.lang.String.
- sc._jvm.scala.util.matching.Regex("(S|s)cala")
  - Same error as above
- sc._jvm.scala.util.matching.Regex(r"(S|s)cala")
  - Same error as above
- sc._jvm.scala.util.matching.Regex("(S|s)cala".r) (the way they do it in Scala)
  - Error: Python string does not have attribute "r"
- sc._jvm.java.util.regex.Pattern.compile("(S|s)cala") successfully creates a Java regex pattern -- and the scala doc clearly states that the Scala regex delegates to the Java regex package...
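One thing that may be worth checking, offered only as an untested sketch: the Scala 2.11 signature is Regex(regex: String, groupNames: String*), and Scala varargs are compiled to a scala.collection.Seq parameter, so py4j may only see a two-argument constructor Regex(String, Seq). The helper name make_scala_regex below is hypothetical, and it relies on Spark's internal PythonUtils.toSeq (not a public API) to build the Seq; both are assumptions rather than anything confirmed above.

```python
def make_scala_regex(sc, pattern, group_names=()):
    """Hypothetical helper: build a scala.util.matching.Regex through py4j.

    Assumes the compiled constructor is Regex(String, Seq[String]), because
    the Scala signature is Regex(regex: String, groupNames: String*).
    """
    jvm = sc._jvm
    # PythonUtils is Spark-internal, not a public API; it is used here only
    # to turn a Python list into a scala.collection.Seq.
    python_utils = jvm.org.apache.spark.api.python.PythonUtils
    scala_seq = python_utils.toSeq(list(group_names))
    # Pass the (possibly empty) Seq explicitly as the second argument.
    return jvm.scala.util.matching.Regex(pattern, scala_seq)
```

With a live SparkContext this would be called as make_scala_regex(sc, "(S|s)cala"). Whether it resolves the constructor error depends on the assumptions above holding for this exact Spark and Scala build.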
Any help/advice would be much appreciated! Thanks in advance!