
Converting code to Python 3

Hi folks,
I'm about to check our code for future use with ADAMS 2018, which uses Python 3. (It works as is in Python 2.x.)
 
This code that's used to do a dos2unix on Windoze machines throws errors that I can't resolve:
 
import os, sys, string
import re
 
# In_File = File to be converted
In_File = 'test.txt'
# -----------------------------------
# Convert to unix
# -----------------------------------
data = open(In_File, "rb").read()
newdata = re.sub("\r\n", "\n", data)
if newdata != data:
   print('   ' + In_File + " \tis converted to unix format ...")
   f = open(In_File, "wb")
   f.write(newdata)
   f.close()
 
Any suggestions for a replacement for
re.sub("\r\n", "\n", data)
that would work?
 
  • Solved that one.
    Needed to be re.sub(b"\r\n", b"\n", data)
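
The fix works because a file opened in "rb" mode yields bytes in Python 3, so the search and replacement values have to be bytes as well. A minimal sketch of the whole conversion under that assumption; `dos2unix` and the demo file are hypothetical names, and `bytes.replace` is swapped in for `re.sub` since no regular expression is needed here:

```python
import os
import tempfile


def dos2unix(path):
    """Rewrite `path` with LF line endings; return True if the file changed."""
    # Binary mode yields bytes in Python 3, so the patterns must be bytes too.
    with open(path, "rb") as f:
        data = f.read()
    newdata = data.replace(b"\r\n", b"\n")
    if newdata == data:
        return False
    with open(path, "wb") as f:
        f.write(newdata)
    return True


# Demo on a throwaway file (hypothetical content, not from the thread).
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"line one\r\nline two\r\n")

changed = dos2unix(demo_path)
with open(demo_path, "rb") as f:
    result = f.read()
os.remove(demo_path)
print(changed, result)
```

The `b"..."` literals are also accepted by Python 2.6 and later, so this form should run unchanged under both interpreter generations.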
     
    My next task:
    Accessing the acar.log with
    f=open(FileName)
    lines=f.readlines()
     
    throws
    Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "<string>", line 25, in <module>
    File "/appl/mdi/2018_RH6/python/linux64/lib/python3.5/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 6951: ordinal not in range(128)
    ! &> integer = (eval(run_python_code("exec(open('grep_file.py').read())")))
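
The traceback shows open() decoding with the ASCII codec, which is what Python 3 falls back to when the locale reports no richer default encoding. One way around it is to name the encoding explicitly. A minimal sketch, assuming the log is Latin-1 (byte 0xC4 would then be an A-umlaut); the real encoding of acar.log is not stated in the thread, and `read_lines` and the demo file are hypothetical:

```python
import os
import tempfile


def read_lines(path, encoding="latin-1"):
    """Return the lines of a text file, decoded with an explicit encoding."""
    # latin-1 maps every byte to a code point, so decoding never fails;
    # whether it is the *right* encoding for the file is an assumption.
    with open(path, encoding=encoding) as f:
        return f.readlines()


# Demo: a throwaway file containing byte 0xC4 (A-umlaut in latin-1),
# the same byte the traceback complains about.
fd, demo_path = tempfile.mkstemp(suffix=".log")
with os.fdopen(fd, "wb") as f:
    f.write(b"\xc4nderung in Zeile 1\nplain ascii line\n")

lines = read_lines(demo_path)
os.remove(demo_path)
print(lines)
```

If the encoding is unknown and only the ASCII parts matter, `open(path, errors="replace")` is an alternative that substitutes a placeholder character for undecodable bytes instead of raising.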
     
  • Hello Martin,
     
    The problem is that the beta license ended today. I don't know whether you or somebody else has a longer license. I tried to run your code in 2018, but for me the beta, and with it Python 3, is over for now.
     
    Regards Jozef
  • Using bytestrings for ASCII text isn't a good idea; you should leave it as Unicode. Just read and write the file using the default encoding, but with an explicit "newline" on the write:
     
    with open(in_file, "r") as input:
        data = input.read()
    with open(out_file, "w", newline="\n") as output:
        output.write(data)
     
    (Where the hell is the BBcode-like [code] block on this crappy forum?)
  • Eric, wouldn't the
    with open(out_file, "w", newline="\n")
    be sensitive to DOS/UNIX ASCII? If so, that's not feasible, as I need something general that works on both platforms with both kinds of ASCII files.
     
    Generally speaking, I'm not really happy with what PY3 demands.
    To me it seems like the nerds have taken over and you have to program somehow "cleaner" than you used to.
    For example, in "old" code I have things like this:
    import string
    a = "this is a test"
    b = string.replace(a, "is a test", "sucks")
    print b
     
    Guess what? It doesn't work anymore.
    In this case it's a simple fix to do
    b = a.replace("is a test", "sucks")
    print(b)
     
    But imagine you have lots of those in legacy code you inherited to maintain.
    You need to first change all of it (which already is a PITA) and then test ALL the code under different combinations with 2017.2 and 2018 ADAMS on Linux and Windoze.
    Something I'm as fond of as seeing my dentist to get my wisdom teeth removed.
     
    @Jozef: ADAMS 2018 is using Python 3.5.1 on Linux and this can be downloaded and installed separately:
     
    If you test code without the MSC python distribution, you may use this addition which Eric Fahlgren proposed long ago:
     
    if __name__ != '__main__':
        import aview_main
    else:
        class aview_main:
            """Stand-in for the ADAMS module: prints instead of calling ADAMS."""
            @staticmethod
            def evaluate_exp(exp):
                print(exp)
                return exp
            @staticmethod
            def execute_cmd(cmd):
                print(cmd)
     
    It's kind of faking an aview_main module and just prints what you'd send to ADAMS. This lets you still test a lot of stuff outside ADAMS.
     
  • Martin,
     
    The 'open' function in Py3 is fully aware of the native context, so it will "do the right thing" on any given platform. You can force it on write, though, as I did with the example, to write Unix eol (linefeed) irrespective of platform. You could just as easily force DOS with "\r\n", or you could leave off the newline parameter and just let it write native ones.
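
The behaviour described above can be checked with a small sketch. It assumes a throwaway input file with mixed line endings (hypothetical content); reading in text mode with the default settings normalizes both endings, and `newline="\n"` on the write side pins the output to linefeeds regardless of platform:

```python
import os
import tempfile

# Throwaway file with mixed DOS and Unix line endings (hypothetical content).
fd, src_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"dos line\r\nunix line\n")
dst_path = src_path + ".out"

# Text mode with the default newline=None: "\r\n" and "\n" both read as "\n".
with open(src_path, "r") as src:
    data = src.read()

# newline="\n" writes "\n" untranslated on every platform.
with open(dst_path, "w", newline="\n") as dst:
    dst.write(data)

# Inspect the raw bytes to confirm no "\r" survived.
with open(dst_path, "rb") as f:
    raw = f.read()

os.remove(src_path)
os.remove(dst_path)
print(repr(data), raw)
```

So the input may be DOS or Unix flavoured on either platform; only the output convention is fixed by the `newline` argument.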
     
    Yes, there are a lot of little things changed in 3.x, but they are pretty easily dealt with and the overall improvement to the language is huge (especially for anyone who ever had to deal with non-ASCII text, say, maybe something with umlauts or other diacritics in it). In Py2, the Unicode handling was spotty and inconsistent, which ended up corrupting things and forcing you into various tricks that didn't always work. Py3 fixes all that by putting the encode/decode firewall between strings and bytes. When they made that fix, they had the opportunity to clean things up, so they took it: no more string module functions, they are all methods on str now.
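
The "firewall" between strings and bytes can be illustrated with a short sketch. The sample word is an assumption chosen only because it contains an umlaut:

```python
# Text (str) and binary data (bytes) are distinct types in Python 3.
text = "M\u00e4rz"               # German "Maerz", umlaut written as an escape
raw = text.encode("utf-8")       # str -> bytes, encoding chosen explicitly

# The same text has a different byte length depending on the encoding.
utf8_len = len(raw)
latin1_len = len(text.encode("latin-1"))

# bytes -> str must also name an encoding; the round trip is lossless.
round_trip = raw.decode("utf-8")

# Mixing the two types raises an error instead of converting silently.
try:
    text + raw
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(utf8_len, latin1_len, round_trip == text, mixed_ok)  # 5 4 True False
```

In Python 2 the equivalent mix of `unicode` and `str` would attempt an implicit ASCII conversion, which is where the intermittent corruption and decode errors came from.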
     
    > alias 2to3  '"t:/Python27/Tools/Scripts/2to3.py" --fix=all --fix=idioms --nofix=basestring \!:*'
     
    Overall, Py3 is greatly improved compared to Py2. In 3.4 (the first "useable" version of 3 in my opinion), you were about at status quo with 2.7, but now with 3.6, it's way faster, has a smaller memory footprint, is much richer in both the libraries and language constructs, and has all sorts of nice little tweaks to make your code much more readable. Here are a couple:
     
    Python 3.6.3 (v3.6.3:2c5fed8, Oct 3 2017, 18:11:49)
    >>> x = 1_000_000
    >>> y = f"The value of x is {x}"
    >>> y
    'The value of x is 1000000'