write down,forget
adidas eqt support ultra primeknit vintage white coming soon adidas eqt support ultra boost primeknit adidas eqt support ultra pk vintage white available now adidas eqt support ultra primeknit vintage white sz adidas eqt support ultra boost primeknit adidas eqt adv support primeknit adidas eqt support ultra boost turbo red white adidas eqt support ultra boost turbo red white adidas eqt support ultra boost turbo red adidas eqt support ultra whiteturbo adidas eqt support ultra boost off white more images adidas eqt support ultra boost white tactile green adidas eqt support ultra boost beige adidas eqt support ultra boost beige adidas eqt support refined camo drop adidas eqt support refined camo drop adidas eqt support refined running whitecamo adidas eqt support 93 primeknit og colorway ba7506 adidas eqt running support 93 adidas eqt support 93

热门话题,时间及空目录的处理

<Category: Hadoop, Linux> 查看评论

 

先查看hadoop目录的文件数,然后再决定是不是在input里面加上该目录
[dev@platformB dailyrawdata]$  fs -ls / |wc -l
3

计算时间的方法
[dev@platformB dailyrawdata]$ lastdate=20110619
[dev@platformB dailyrawdata]$ echo $lastdate
20110619
[dev@platformB dailyrawdata]$ echo date --date "-d $lastdate + 1day" +"%Y%m%d"
20110620

[dev@platformB dailyrawdata]$ echo D9=date --date "now -20 day" +"%Y%m%d"
D9=20110530

 

[dev@platformB dailyrawdata]$ D1=date --date "now" +"%Y/%m/%d"
[dev@platformB dailyrawdata]$ echo $D1
2011/06/20

注:等号后面不能有空格,如下面:

[dev@platformB dailyrawdata]$ D1= date --date "now" +"%Y/%m/%d"
-bash: 2011/06/20: No such file or directory

 

拷贝今天的文件到指定目录

DAYSTR=date --date "now" +"%Y/%m/%d"

hadoop fs -copyFromLocal dailyrawdata/* /trendingtopics/data/raw/$DAYSTR

 

慢着,当目录下文件为空的时候,Hadoop Stream Job的根据你指定的Input Pattern找不到文件的时候会抛异常,结果就造成了Job的失败。

找了半天也没有找到好的办法(那个知道比较好的办法,还请不吝赐教啊),只能先判断目录是否为空,为空则将文件夹重定向到一个空文件。

#touch blank file
BLANK=”/your folder/temp/blank”
hadoop fs -touchz $BLANK

#define a function to check hdfs files
function check_hdfs_files(){

#run hdfs command to check the files
hadoop fs -ls $1 &>/dev/null

#if file match is zero
#check file exists
if  [ $? -ne 0 ]
then
eval “$2=$BLANK”
echo “can’t find any files,use blank file instead”
fi

return $?
}

 

D0=date --date "now" +"/your folder/%Y/%m/%d/${APPNAME}-${TENANT}*"
D1=date --date "now -1 day" +"/your folder/%Y/%m/%d/$APPNAME-$TENANT*"

#check file exists
check_hdfs_files $D0 “D0”
check_hdfs_files $D1 “D1”

本文来自: 热门话题,时间及空目录的处理