Filling missing values in a specified format - Python

7cwmlq89  posted on 2023-03-20 in Python

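I ran into a problem that explicitly requires me not to use numpy or pandas.
Problem:
Given a string containing numbers and '_' (missing value) symbols, you have to replace the '_' symbols as explained below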

Ex 1: _, _, _, 24 ==> 24/4, 24/4, 24/4, 24/4, i.e. we have distributed the 24 equally to all 4 places

Ex 2: 40, _, _, _, 60 ==> (60+40)/5, (60+40)/5, (60+40)/5, (60+40)/5, (60+40)/5 ==> 20, 20, 20, 20, 20, i.e. the sum (60+40) is distributed equally to all 5 places

Ex 3: 80, _, _, _, _  ==> 80/5, 80/5, 80/5, 80/5, 80/5 ==> 16, 16, 16, 16, 16, i.e. the 80 is distributed equally to all 5 places: itself and the 4 missing values to its right

Ex 4: _, _, 30, _, _, _, 50, _, _  
==> we will fill the missing values from left to right 
    a. first we will distribute the 30 to the two missing values on its left (10, 10, 10, _, _, _, 50, _, _)
    b. now distribute the sum (10+50) across the missing values in between (10, 10, 12, 12, 12, 12, 12, _, _) 
    c. now we will distribute the 12 to the missing values on its right (10, 10, 12, 12, 12, 12, 4, 4, 4)
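
Steps a to c of Ex 4 can be traced with one small helper. This is only a sketch for illustration: the name `spread` and the use of `None` for a blank are assumptions, not part of the problem statement.

```python
def spread(values, start, end):
    """Share the known values in values[start..end] (inclusive) equally over that span."""
    total = sum(v for v in values[start:end + 1] if v is not None)
    share = total / (end - start + 1)
    for i in range(start, end + 1):
        values[i] = share


# Ex 4, with None standing in for '_'
row = [None, None, 30, None, None, None, 50, None, None]
spread(row, 0, 2)  # step a: 30 shared with the two blanks on its left
spread(row, 2, 6)  # step b: (10 + 50) shared over the five places in between
spread(row, 6, 8)  # step c: 12 shared with the two blanks on its right
print(row)
```

Each call reuses the value left behind by the previous one, which is why the order (left to right) matters.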

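For a given string of comma-separated values containing both missing values and numbers, e.g. "_,x,_", you need to fill in the missing values. Q: your program reads a string such as "_,x,_" and returns the filled sequence. Ex: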

Input1: "_,_,_,24"
Output1: 6,6,6,6

Input2: "40,_,_,_,60"
Output2: 20,20,20,20,20

Input3: "80,_,_,_,_"
Output3: 16,16,16,16,16

Input4: "_,_,30,_,_,_,50,_,_"
Output4: 10,10,12,12,12,12,4,4,4

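I tried using the split function to break the string into a list, then checked for blanks on the left and counted them. Once I hit a non-blank, I divided that number by the total count (i.e. the number of blanks encountered before the number, plus the number itself), then spread the result over the number and the blanks to its left.
Then I check for blanks between two numbers and apply the same logic, and afterwards do the same for the blanks on the right.
However, the code I have shared below throws all sorts of errors, and I believe there are holes in the logic described above, so I would appreciate any insight into solving this problem.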

def blanks(S):

  a= S.split()
  count = 0
  middle_store = 0
  #left
  for i in range(len(a)):
    if(a[i]=='_'):
      count = count+1  #find number of blanks to the left of a number
    else:
      for j in range(0,i+1):
        #if there are n blanks to the left of the number speard the number equal over n+1 spaces
        a[j] = str((int(a[i])/(count+1)))
        middle_store= i
    break  

  #blanks in the middle
  denominator =0
  flag = 0
  for k in len(middle_store+1,len(a)):
    if(a[k] !='_'):
      denominator = (k+1-middle_store)
      flag=k
    break

  for p in len(middle_store,flag+1):
    a[p] = str((int(a[p])/denominator))

  #blanks at the right 
  for q in len(flag,len(a)):
    a[q] = str((int(a[q])/(len(a)-flag+1)))

S=  "_,_,30,_,_,_,50,_,_"
print(blanks(S))

ybzsozfc1#

A modular solution

# takes an array x and two indices a,b. 
# Replaces all the _'s with (x[a]+x[b])/(b-a+1)
def fun(x, a, b):
    if a == -1:
        v = float(x[b])/(b+1)
        for i in range(a+1,b+1):
            x[i] = v
    elif b == -1:
        v = float(x[a])/(len(x)-a)
        for i in range(a, len(x)):
            x[i] = v
    else:
        v = (float(x[a])+float(x[b]))/(b-a+1)
        for i in range(a,b+1):
            x[i] = v
    return x

def replace(text):
    # Create array from the string
    x = text.replace(" ","").split(",")
    # Get all the pairs of indices having number
    y = [i for i, v in enumerate(x) if v != '_']
    # Starting with _ ?
    if y[0] != 0:
        y = [-1] + y
    # Ending with _ ?
    if y[-1] != len(x)-1:
        y = y + [-1]    
    # run over all the pairs
    for (a, b) in zip(y[:-1], y[1:]): 
        fun(x,a,b)          
    return x

# Test cases
tests = [
    "_,_,_,24",
    "40,_,_,_,60",
    "80,_,_,_,_",
     "_,_,30,_,_,_,50,_,_"]

for i in tests:
    print (replace(i))
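
`replace` returns a list of floats, while the expected outputs in the question are comma-separated integers such as 6,6,6,6. A possible formatting step is sketched below; `format_filled` is a hypothetical helper name, and it assumes you want whole-number shares printed without the trailing `.0` and everything else left as a float.

```python
def format_filled(values):
    """Join filled values with commas, printing whole-number floats as integers."""
    parts = []
    for v in values:
        f = float(v)
        parts.append(str(int(f)) if f.is_integer() else str(f))
    return ",".join(parts)


print(format_filled([10.0, 10.0, 12.0, 12.0, 12.0, 12.0, 4.0, 4.0, 4.0]))
```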

5uzkadbs2#

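First, you should pass a separator to the split method; by default it splits on whitespace.
So "_,_,x,_,_,y,_".split() gives ['_,_,x,_,_,y,_']
"_,_,x,_,_,y,_".split(',') gives ['_', '_', 'x', '_', '_', 'y', '_']
Second, in the "middle" and "right" loops you need to replace len with range.
Because of the division, it is better to use float rather than int.
Since you divide by it, it is also better to initialize the denominator to 1.
In the last loop, a[q] = str((int(a[q])/(len(a)-flag+1))) (and likewise for a[p]) should raise an error, because a[q] is "_". You need a variable that holds the value of a[flag].
Each break should be inside the else or if statement; otherwise the loop body runs only once.
Finally, to reduce the work done, you can move the middle_store assignment out of the j loop so that it is not reassigned on every iteration.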
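
The first few points can be checked in isolation. The snippet below is a small sketch, separate from the question's code, that shows what `split` returns with and without a separator and which exceptions the `len(...)` loops and `int('_')` raise.

```python
s = "_,_,30,_,_,_,50,_,_"

print(s.split())     # no separator: the whole string stays as one element
print(s.split(","))  # comma separator: one element per value

caught = []

try:
    for k in len(1, 3):  # len() accepts exactly one argument; range(1, 3) was intended
        pass
except TypeError as exc:
    caught.append(type(exc).__name__)

try:
    int("_")  # a blank cannot be converted to a number
except ValueError as exc:
    caught.append(type(exc).__name__)

print(caught)
```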
TL;DR: try this:

def blanks(S):
    a = S.split(',')
    count = 0
    middle_store = 0
    # left
    for i in range(len(a)):
        if a[i] == '_':
            count = count + 1  # find number of blanks to the left of a number
        else:
            for j in range(i + 1):
                # if there are n blanks to the left of the number speard the number equal over n+1 spaces
                a[j] = str((float(a[i]) / (count + 1)))
            middle_store = i
            middle_store_value = float(a[i])
            break

    # blanks in the middle
    denominator = 1
    flag = 0
    for k in range(middle_store + 1, len(a)):
        if a[k] != '_':
            denominator = (k + 1 - middle_store)
            flag = k
            break
    flag_value = float(a[flag])
    for p in range(middle_store, flag + 1):
        a[p] = str((middle_store_value+flag_value) / denominator)

    # blanks at the right
    last_value = float(a[flag])
    for q in range(flag, len(a)):
        a[q] = str(last_value / (len(a) - flag))

    return a

S=  "_,_,30,_,_,_,50,_,_"
print(blanks(S))
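
The function above keeps the question's three-phase layout and reproduces Output4 and Output2. With a single number in the string (Input1 and Input3), no second number is found, `flag` stays 0, and the right-hand phase overwrites the earlier result. One way to cover any count of numbers is sketched below. `fill_blanks` is a hypothetical name, and the sketch assumes well-formed input with non-negative numbers and at least one of them.

```python
def fill_blanks(text):
    """Fill '_' entries left to right by sharing each span's known values equally."""
    cells = [c.strip() for c in text.split(",")]
    known = [i for i, c in enumerate(cells) if c != "_"]
    if not known:
        raise ValueError("the input needs at least one number")

    # Blanks count as 0 so that summing a span adds up only its known values.
    values = [0.0 if c == "_" else float(c) for c in cells]

    # Span boundaries: start of list, every known position, end of list.
    bounds = [0] + known + [len(cells) - 1]
    for lo, hi in zip(bounds, bounds[1:]):
        if lo == hi:
            continue  # nothing to share over a single place
        width = hi - lo + 1
        share = sum(values[lo:hi + 1]) / width
        values[lo:hi + 1] = [share] * width
    return values


for sample in ("_,_,_,24", "40,_,_,_,60", "80,_,_,_,_", "_,_,30,_,_,_,50,_,_"):
    print(fill_blanks(sample))
```

Because the spans are processed left to right, each span starts from the value written by the previous one, as in Ex 4.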

P.S.: Did you even try to fix the errors yourself? Or are you just waiting for someone else to solve your math problem?


1tuwyuhd3#

The problem in question can also be solved as follows. The code is not optimized or simplified, but it approaches the problem from a different angle:

import re 
def curve_smoothing(S): 

    pattern = r'\d+'
    ls_num = re.findall(pattern, S)   # list of the numerals present in the string
    spaces = re.split(pattern, S)     # split the string to separate the '_' blanks

    if len(spaces[0])==0 and len(ls_num)==1:
        Space_num=len(re.findall('_',  S))
        sums=int(ls_num[0])
        repl_int=round(sums/(Space_num+1))
        S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
        S=re.sub('_', str(repl_int),S, Space_num)
        return S

    elif len(spaces[0])==0 and len(ls_num)>1:
        for i in range(1,len(spaces)):
            if i==1:
                Space_num=len(re.findall('_',  spaces[i]))
                sums=int(ls_num[i-1])+int(ls_num[(i)])
                repl_int=round(sums/(Space_num+2))
                S=re.sub(str(ls_num[i-1]), str(repl_int),S)
                S=re.sub('_', str(repl_int),S, Space_num)
                S=re.sub(str(ls_num[i]), str(repl_int),S,1)
                ls_num[i]=repl_int
            elif i<len(ls_num):
                Space_num=len(re.findall('_',  spaces[i]))
                sums=int(ls_num[i-1])+int(ls_num[(i)])
                repl_int=round(sums/(Space_num+2))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
                S=re.sub(str(ls_num[i]), str(repl_int),S,1)
                ls_num[i]=repl_int
            elif len(spaces[-1])!=0:
                Space_num=len(re.findall('_',  spaces[i]))
                repl_int=round(ls_num[(i-1)]/(Space_num+1))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
        return S

    else:
        for i in range(len(spaces)):
            if i==0:
                Space_num=len(re.findall('_',  spaces[i]))
                sums=int(ls_num[(i)])
                repl_int=round(sums/(Space_num+1))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
                S=re.sub(str(ls_num[i]), str(repl_int),S, 1)
                ls_num[i]=repl_int
            elif i>=1 and i<len(ls_num):
                Space_num=len(re.findall('_',  spaces[i]))
                sums=int(ls_num[i-1])+int(ls_num[(i)])
                repl_int=round(sums/(Space_num+2))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
                S=re.sub(str(ls_num[i]), str(repl_int),S,1)
                ls_num[i]=repl_int
            elif len(spaces[-1])!=0:
                Space_num=len(re.findall('_',  spaces[i]))
                repl_int=round(ls_num[(i-1)]/(Space_num+1))
                S=re.sub(r'(\d{2})(,_)', r'{}\2'.format(str(repl_int)) , S, 1)
                S=re.sub('_', str(repl_int),S, Space_num)
        return S

S1="_,_,_,24"
S2="40,_,_,_,60"
S3=  "80,_,_,_,_"
S4="_,_,30,_,_,_,50,_,_"
S5="10_,_,30,_,_,_,50,_,_"
S6="_,_,30,_,_,_,50,_,_20"
S7="10_,_,30,_,_,_,50,_,_20"

print(curve_smoothing(S1))
print(curve_smoothing(S2))
print(curve_smoothing(S3))
print(curve_smoothing(S4))
print(curve_smoothing(S5))
print(curve_smoothing(S6))
print(curve_smoothing(S7))

wqsoz72f4#

# _, _, 30, _, _, _, 50, _, _ 
def replace(string):
    lst=string.split(',')
    for i in range(len(lst)):
        if lst[i].isdigit():
            for j in range(i+1):
                lst[j]=int(lst[i])//(i+1)
            new_index=i
            new_value=int(lst[i])
            break
    for i in range(new_index+1,len(lst)):
        if lst[i].isdigit():
            temp=(new_value+int(lst[i]))//(i-new_index+1)
            for j in range(new_index,i+1):
                lst[j]=temp
            new_index=i
            new_value=int(lst[i])
    # Trailing blanks: every '_' still in the list lies after new_index,
    # so spread the last value over itself and the blanks to its right.
    if new_index < len(lst) - 1:
        temp1 = new_value // (len(lst) - new_index)
        for i in range(new_index, len(lst)):
            lst[i] = temp1
    return lst
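
This answer divides with `//` (floor division). The sample inputs all divide evenly, so the results match the expected outputs. The sketch below, which uses a made-up input rather than one from the question, shows how `//` and `/` differ when the share is not a whole number.

```python
total, places = 10, 3  # e.g. the hypothetical input "_,_,10"

floor_share = total // places  # 3: the remainder is dropped
true_share = total / places    # a float close to 3.33

print([floor_share] * places, "sums to", floor_share * places)
print([true_share] * places, "sums to", true_share * places)
```

With `//` the filled values no longer add up to the original number, so which operator to use depends on whether the total must be preserved.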

h4cxqtbf5#

def replace(string):
    lst = string.split(',')
    index0 = 0
    while True:
        index1 = index0
        # Left anchor: its value, or 0 if the string starts with a blank
        value1 = 0
        if lst[index0].strip().isdigit():
            value1 = int(lst[index0].strip())
        # Right anchor: the next number after index1, if there is one
        index2 = index1
        for i in range(index1 + 1, len(lst)):
            if lst[i].strip().isdigit():
                index2 = i
                break
        value2 = 0
        if index2 > index1:
            value2 = int(lst[index2].strip())
        else:
            # No number to the right: spread value1 up to the end of the list
            index2 = len(lst) - 1
        value = str(int((value1 + value2) / ((index2 - index1) + 1)))
        for i in range(index1, index2 + 1):
            lst[i] = value
        index0 = index2

        if index0 >= (len(lst) - 1):
            break

    return lst

string = "20,_,_,30, _, _,10,_,_,_,_,110"
print(replace(string))

p4rjhz4m6#

# write your python code here
# you can take the above example as sample input for your program to test
# it should work for any general input try not to hard code for only given input strings
#run your code in the function for each of the inputs mentioned above and make sure that you get the same results
def appendvalues(value, startIndex, endIndex, values_list):
  # Overwrite values_list[startIndex:endIndex] in place with value
  for i in range(startIndex, endIndex):
    values_list[i] = value

def calculate_missing_values(values_list):
  filled_values = []
  filled_positions = []
  for i in range(len(values_list)):
    if values_list[i].isdigit():
      if len(filled_positions) == 0:
        # First number: spread it over itself and the blanks to its left
        missingvalues = int(int(values_list[i]) / (i + 1))
        appendvalues(missingvalues, 0, i + 1, values_list)
      else:
        # Later number: spread its sum with the previous filled value over the span between them
        missingvalues = int((filled_values[-1] + int(values_list[i])) / ((i + 1) - filled_positions[-1]))
        appendvalues(missingvalues, filled_positions[-1], i + 1, values_list)
      filled_positions.append(i)
      filled_values.append(int(values_list[i]))
  if filled_positions[-1] != len(values_list) - 1:
    # Trailing blanks: spread the last filled value over itself and the blanks to its right
    missingvalues = int(int(values_list[filled_positions[-1]]) / (len(values_list) - filled_positions[-1]))
    appendvalues(missingvalues, filled_positions[-1], len(values_list), values_list)
  return values_list

# you can free to change all these codes/structure
def curve_smoothing(string):
    # your code
    values_list = string.split(',')
    filled_values=calculate_missing_values(values_list)
    return filled_values#list of values

S=  "_,_,30,_,_,_,50,_,_"
smoothed_values= curve_smoothing(S)
print(smoothed_values)

ui7jx7zq7#

"Checked that it works for all the inputs"
def replace(s):
    lst = s.split(",")

    if lst[0].isdigit():
        # The string starts with a number: use it as the first anchor
        index = 0
        value = int(lst[0])
    else:
        # Leading blanks: spread the first number over itself and the blanks to its left
        for i in range(len(lst)):
            if lst[i].isdigit():
                for j in range(i + 1):
                    lst[j] = int(lst[i]) // (i + 1)
                index = i
                value = int(lst[i])
                break

    # Blanks between two numbers
    for i in range(index + 1, len(lst)):
        if lst[i].isdigit():
            temp = (value + int(lst[i])) // (i - index + 1)
            for j in range(index, i + 1):
                lst[j] = temp
            index = i
            value = int(lst[i])

    # Trailing blanks: spread the last value over itself and the blanks to its right
    if index < len(lst) - 1:
        temp1 = value // (len(lst) - index)
        for i in range(index, len(lst)):
            lst[i] = temp1
    return lst
