Python 2: Using regex to pull out whole lines from a text file with a substring from another


I have a noob question. I'm using Python 2.7.6 on a Linux system.

What I'm trying to achieve is to use specific numbers in a list, which correspond to the last number on each line of a database text file, to pull out the whole line from the database text file and print it (I'm going to write the line to a text file later).

The code I'm currently trying to use:

    reg = re.compile(r'(\d+)$')

    for line in "text file database":
        if list_line in reg.findall(line):
            print line

What I have found is that I can input a string like

    list_line = "9"

and it will output the whole line of the corresponding database entry just fine. But trying to use list_line to input strings one by one in a loop doesn't work.

Can someone please help me out or direct me to a relevant source?

Appendix:

The database text file contains data similar to this:

    gnl acep_1.0 acep10001-pa 1
    gnl acep_1.0 acep10002-pa 2
    gnl acep_1.0 acep10003-pa 3
    gnl acep_1.0 acep10004-pa 4
    gnl acep_1.0 acep10005-pa 5
    gnl acep_1.0 acep10006-pa 7
    gnl acep_1.0 acep10007-pa 6
    gnl acep_1.0 acep10008-pa 8
    gnl acep_1.0 acep10009-pa 9
    gnl acep_1.0 acep10010-pa 10

The search text file (the source of list_line) looks similar to this:

    2
    5
    4
    6

Updated original code:

    #import extensions
    import linecache
    import re

    #set re.compiler parameters
    reg = re.compile(r'(\d+)$')

    #designate and open list file
    in_list = raw_input("list input: ")
    open_list = open(in_list, "r")

    #count lines in list file
    total_lines = sum(1 for line in open_list)
    print total_lines

    #open out file in write mode
    outfile = raw_input("output: ")
    open_outfile = open(outfile, "w")

    #designate db string
    db = raw_input("db input: ")
    open_db = open(db, "r")
    read_db = open_db.read()
    split_db = read_db.splitlines()
    print split_db

    #set line_number value to 0
    line_number = 0

    #count through line numbers and print line
    while line_number < total_lines:
        line_number = line_number + 1
        print line_number

        list_line = linecache.getline(in_list, line_number)
        print list_line

        for line in split_db:
            if list_line in reg.findall(line):
                print line

    #close files
    open_list.close()
    open_outfile.close()
    open_db.close()

Short version: your for loop is going through the "database" file once, looking for the corresponding text, and stopping. So if you have multiple lines you want to pull out, as in your list_line file, you'll end up pulling out only a single line.

Also, the way you're looking for the line number isn't a great idea. What happens if you're looking for line 5, but the second line just happens to have the digit 5 somewhere in its data? E.g., if the second line looks like:

    gnl acep_1.0 acep15202-pa 2

Then a search that matches "5" anywhere in the line will return that line instead of the one you intended. Instead, since you know that the line number is always going to be the last number on the line, you should take advantage of Python's str.split() function (which splits a string on whitespace and returns a list of the items) and the fact that you can use -1 as a list index to get the last item of a list, like so:

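    def get_one_line(line_number_string):
        with open("database_file.txt", "r") as datafile:  # open the file for reading
            for line in datafile:  # this is how you get one line at a time in Python
                items = line.rstrip().split()
                # "items and" skips blank lines, which would have no last item
                if items and items[-1] == line_number_string:
                    return line.rstrip()  # drop the newline; print adds its own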
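To see why comparing only the last item is safer than matching anywhere in the line, here is a small sketch using the sample line from above. The helper names matches_anywhere and matches_last_field are made up for this illustration and are not part of the original code.

```python
def matches_anywhere(line, number):
    # Naive check: is the number a substring anywhere in the line?
    return number in line


def matches_last_field(line, number):
    # Safer check: compare only the last whitespace-separated item.
    items = line.split()
    return bool(items) and items[-1] == number


sample = "gnl acep_1.0 acep15202-pa 2\n"

false_hit = matches_anywhere(sample, "5")      # True: the "5" inside acep15202
wrong_field = matches_last_field(sample, "5")  # False: the last item is "2"
right_field = matches_last_field(sample, "2")  # True
```

Note that split() with no arguments also discards the trailing newline, so the last item compares cleanly against "2".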

One thing I haven't talked about is the rstrip() function. When you iterate over a file in Python, you get each line as-is, with its newline character still intact. When you print it later, you'll probably be using print, and print also adds a newline character at the end of whatever you give it. So unless you use rstrip() you'll end up with two newline characters instead of one, resulting in a blank line between every line of your output.
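A quick way to see the leftover newline for yourself is sketched below. It uses io.StringIO as an in-memory stand-in for the list file (an assumption made only to keep the example self-contained; a real file opened with open() behaves the same way when you loop over it).

```python
import io

# In-memory stand-in for the list file; iterating over it yields lines
# exactly as a real text file would, newline included.
list_file = io.StringIO(u"2\n5\n")

raw_lines = [line for line in list_file]
stripped_lines = [line.rstrip() for line in raw_lines]

# The raw line is "2\n", which is not equal to "2"; the stripped one is.
raw_matches = raw_lines[0] == u"2"
stripped_matches = stripped_lines[0] == u"2"
```

The same leftover newline matters for comparisons as well as for printing: a line read straight from a file will not compare equal to the bare number until it has been stripped.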

The other thing you're probably not familiar with there is the with statement. Without going into too much detail, it ensures that your database file will be closed when the return statement is executed. The details of how with works are interesting reading for someone who knows a lot about Python, but as a Python newbie you probably won't want to dive into that just yet. Just remember that when you open a file, try to use with open("filename") as some_variable: and Python will do the right thing™.
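If you want to convince yourself that with really does close the file when you return from inside it, a minimal sketch follows. The temporary file, the handles list, and the first_line function are all invented for this demonstration.

```python
import os
import tempfile

# Create a throwaway one-line file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("gnl acep_1.0 acep10001-pa 1\n")

handles = []  # keeps a reference so the file object can be inspected later


def first_line(filename):
    with open(filename, "r") as datafile:
        handles.append(datafile)
        for line in datafile:
            return line.rstrip()  # returning here leaves the with block


result = first_line(path)
closed_after_return = handles[0].closed  # the with statement closed the file
os.remove(path)
```

Every file object has a closed attribute, so you can check it at any point if you are unsure whether a file is still open.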

Okay. Now that you have that get_one_line() function, you can use it like this:

    with open("list_line.txt", "r") as line_number_file:
        for line in line_number_file:
            line_number_string = line.rstrip()  # we don't want the newline character
            database_line = get_one_line(line_number_string)
            print database_line  # or do whatever you need to with it

Note: if you're using Python 3, replace print line with print(line): in Python 3, the print statement became a function.

There's more you could do with this code (for example, opening the database file every single time you look for a line is kind of inefficient; reading the whole thing into memory once and then looking up your data afterwards would be better). But this is good enough to get you started with, and if your database file is small, the time you'd lose worrying about efficiency would be far more than the time you'd lose doing it the simple-but-slower way.
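For when you do want the faster version, here is one possible sketch of the read-it-once idea: build a dictionary keyed by the last item on each line, then look each number up in it. The function names build_index and look_up are hypothetical, and the sample data is taken from the question. It assumes every number appears at most once in the database (a later duplicate would overwrite an earlier one).

```python
def build_index(database_lines):
    """Map the last item of each line to the whole line (newline removed)."""
    index = {}
    for line in database_lines:
        line = line.rstrip()
        items = line.split()
        if items:  # skip blank lines
            index[items[-1]] = line
    return index


def look_up(index, wanted_numbers):
    """Return the matching lines, in the order the numbers were asked for."""
    found = []
    for number in wanted_numbers:
        number = number.strip()
        if number in index:
            found.append(index[number])
    return found


# Sample data from the question; in real use these would be the lines of
# the database file and of the list file.
database = [
    "gnl acep_1.0 acep10005-pa 5\n",
    "gnl acep_1.0 acep10006-pa 7\n",
    "gnl acep_1.0 acep10007-pa 6\n",
]
wanted = ["5\n", "6\n", "42\n"]

matches = look_up(build_index(database), wanted)
```

Both functions accept any iterable of lines, so you can pass an open file object directly instead of a list. Numbers with no matching line (like 42 here) are simply skipped. To write the results out later, loop over matches and call write(line + "\n") on your output file.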

So see if this helps you, then come back and ask more questions if there's something you don't understand or that isn't working.

