Pandas read_csv: reading multiple IPs from a single CSV row

Posted by fnvucqvd on 2023-11-14 in Other

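I'm new to Python and I want to run reverse DNS lookups on the IPs in a CSV file. The CSV has an 'ip' column, with addresses formatted like this:
[image: example of source input data]
My Python script only works for rows with a single IP, and I'm not sure how to read rows that contain multiple IP addresses.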

#pip install pandas
#pip install dnspython
#pip install xlrd, pip install xlsxwriter

### Import needed libraries ###
import pandas as pd
import time 

from pandas.io.excel import ExcelWriter
from dns import resolver,reversename

### Time variable ###
startTime = time.time()

#Custom Delimiter
custom_delimiter = ','
logs = pd.read_csv('C:\\Users\\user\\Desktop\\rdnslookup\\input_file\\logs.csv', sep=custom_delimiter, quotechar=',', engine='python', converters={'ip': lambda x:x.strip('["]')}, on_bad_lines = "skip") 

#Diagnostic print statement
print(logs.head())

# Create new Dataframe that removed duplicate IP addresses
logs_filtered = logs.drop_duplicates(['ip']).copy() 

### Perform DNS lookup on deduplicated IPs ###
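def reverseDns(ip):
  try:
    # resolver.resolve needs dnspython >= 2.0; resolver.query is the older, deprecated name
    return str(resolver.resolve(reversename.from_address(ip), 'PTR')[0])
  except Exception:
    return 'N/A'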

### Create DNS column with the reverse IP DNS result ###
logs_filtered['dns'] = logs_filtered['ip'].apply(reverseDns)

### Merge DNS column to full logs matching IP ###
logs_filtered = logs.merge(logs_filtered[['ip','dns']], how='left', on=['ip'])

### Output IP addresses to CSV with DNS lookups ###
writer = ExcelWriter('C:\\Users\\user\\Desktop\\rdnslookup\\output_file\\validated_logs.xlsx', engine='xlsxwriter')
logs_filtered.to_excel(writer, sheet_name='Sheet1', index=False)
writer.close()

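I tried changing the quotechar to "", but that did not strip the quotes from the rows with multiple addresses. Any help is appreciated. Thanks.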

Answer #1, by hc8w905p

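You are suffering from the "when all you have is a hammer, everything starts to look like a nail" syndrome. Pandas is lovely, but it is not productive to spend your time squeezing the problem into a shape that fits pandas. Your problem is easier to solve by reading the file as what it really is: a list of JSON records.
I don't have the dns module installed, so I faked the lookup.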

import pandas as pd
import time 
import json

#from dns import resolver,reversename

### Time variable ###
startTime = time.time()

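rows = []
with open('logs.csv') as f:
    for row in f:
        # Skip the header and any line that is not a JSON list.
        if not row.startswith('['):
            continue
        rows.extend(json.loads(row))

# Remove duplicates. Note that a set does not preserve the original order.

rows = list(set(rows))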

### Perform DNS lookup on deduplicated IPs ###

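def reverseDns(ip):
  try:
    # Placeholder: reverse the string instead of doing a real lookup.
    return ip[::-1]
    # Real lookup (dnspython >= 2.0; resolver.query is the older, deprecated name):
    # return str(resolver.resolve(reversename.from_address(ip), 'PTR')[0])
  except Exception:
    return 'N/A'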

dns = [reverseDns(row) for row in rows]

logs = pd.DataFrame( {'ip': rows, 'dns': dns})
print(logs)

Output:

            ip          dns
0  192.168.1.3  3.1.861.291
1  192.168.1.1  1.1.861.291
2  192.168.1.2  2.1.861.291
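
If you want the same idea without the placeholder, here is a minimal sketch that uses only the standard library. It assumes, as above, that every data line in the file is a JSON list of IP strings. It also swaps dnspython for `socket.gethostbyaddr`, which is a different technique from the `resolver` call in the question. The names `parse_ip_lines`, `reverse_dns`, `fake_lookup` and `failing_lookup` are made up for illustration, and the sample lines stand in for the real file.

```python
import json
import socket


def parse_ip_lines(lines):
    """Collect IPs from lines that each hold a JSON list, keeping first-seen order."""
    seen = {}
    for line in lines:
        line = line.strip()
        if not line.startswith('['):
            continue  # skip the header row and anything that is not a JSON list
        for ip in json.loads(line):
            seen.setdefault(ip, None)  # dict keys de-duplicate and keep insertion order
    return list(seen)


def reverse_dns(ip, lookup=socket.gethostbyaddr):
    """Return the host name reported by lookup(ip), or 'N/A' when the lookup fails."""
    try:
        return lookup(ip)[0]
    except (OSError, ValueError):
        return 'N/A'


# Hypothetical stand-ins so the example runs without touching the network.
def fake_lookup(ip):
    return ('host-' + ip.replace('.', '-') + '.example', [], [ip])


def failing_lookup(ip):
    raise OSError('no PTR record')


# Assumed sample input: a header line followed by JSON lists of IPs.
sample = [
    'ip\n',
    '["192.168.1.1", "192.168.1.2"]\n',
    '["192.168.1.2", "192.168.1.3"]\n',
]

ips = parse_ip_lines(sample)
names = {ip: reverse_dns(ip, lookup=fake_lookup) for ip in ips}
print(ips)
print(names)
```

Passing the lookup function as a parameter keeps the parsing and de-duplication testable offline. Drop the `lookup=` argument to query the system resolver, or pass a small wrapper around the dnspython call if you prefer to keep using it. To read the real file, pass the open file object to `parse_ip_lines` instead of `sample`.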
