How do we calculate academic statistics for an expert?

Zhe Wang, Jie Tang, and Bo Gao

(Alpha version)

#citation

       The number of citations of all publications by an expert.

#publication

       The number of all publications by an expert.

h-index

       An expert has index h if h of his or her N papers have at least h citations each, and the other (N − h) papers have at most h citations each.
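The three basic counts above can be sketched in a few lines. This is a minimal illustration, not the Arnetminer implementation: the function names and the input format (one citation count per paper) are assumptions, and the h-index follows the usual reading of the definition, taking the largest h that satisfies it.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


def basic_stats(citations):
    """Return (#publication, #citation, h-index) for one expert.

    `citations` holds one citation count per paper (hypothetical input format).
    """
    return len(citations), sum(citations), h_index(citations)
```

For example, an expert whose five papers are cited 10, 8, 5, 4 and 3 times has #publication = 5, #citation = 30 and h-index = 4.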

Longevity

    Longevity reflects the length of an author's academic life. We take the year in which the author published his or her first paper as the beginning of that academic life and the year of the last paper as its end. Longevity can then be defined as:
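A minimal sketch of this definition, assuming longevity is the plain difference between the year of the last paper and the year of the first paper. Whether both endpoint years are counted (a "+ 1") is not settled by the text, so treat that choice, the function name and the input format as assumptions.

```python
def longevity(publication_years):
    """Years between an author's first and last paper.

    Assumption: computed as last year - first year, so an author whose
    papers all appeared in a single year gets 0.
    """
    if not publication_years:
        raise ValueError("author has no publications")
    return max(publication_years) - min(publication_years)
```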



Impact Factor

    Basically, the score reflects the importance of a publication venue and is calculated as described at http://en.wikipedia.org/wiki/Impact_factor. In Arnetminer, when evaluating a paper published at a venue, we also consider the paper length. Specifically, if the paper length < 3 pages, we take 1/5 of the impact factor of the publication venue; if 3 <= length < 5 pages, we take 1/3 of the impact factor of the publication venue. Formally, the new score can be defined as follows:


       In the definition, G is a group of papers. IC(P) is the impact factor of the conference in which paper P was published. Weight(P) equals 1/5 if the paper length < 3 pages, and 1/3 if 3 <= length < 5 pages.
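The length weighting can be sketched as follows. Two points are assumptions rather than statements from the text: that the score of a group G is the sum of Weight(P) × IC(P) over its papers, and that papers of 5 pages or more keep the full impact factor (weight 1). The function names and the (impact factor, pages) input format are also illustrative.

```python
def length_weight(pages):
    """Weight(P) as a function of the paper length in pages."""
    if pages < 3:
        return 1 / 5
    if pages < 5:
        return 1 / 3
    return 1.0  # assumption: longer papers count with the full impact factor


def group_impact(papers):
    """Length-weighted impact score of a group of papers.

    `papers` is an iterable of (venue_impact_factor, pages) pairs.
    Assumption: the group score is the sum of Weight(P) * IC(P).
    """
    return sum(length_weight(pages) * ic for ic, pages in papers)
```

Under these assumptions, three papers of 2, 4 and 8 pages at a venue with impact factor 3.0 contribute 0.6, 1.0 and 3.0 respectively.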

Activity

    People's activity is simply defined based on the papers one has published in recent years. We consider the importance of each paper and thus define the activity score as:


       In the definition, n is a year within the most recent N years, and G is the group of papers published by author A in year n. Weight(n) = α^(this year − n). We tentatively set the values of N and α: N = 4 and α = 0.75 if the current month is in the first half of the year (month < July), and N = 3 and α = 0.85 if the current month is in the second half of the year.
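A sketch of the time decay, under stated assumptions: the activity score is taken to be the sum over the recent N years of Weight(n) times the score of that year's paper group G (for instance the length-weighted impact score described under Impact Factor), and the recent N years are taken to end with the current year. Neither is spelled out in the text; the values of N and α are the ones given above.

```python
def activity_params(month):
    """(N, alpha) from the text: (4, 0.75) before July, (3, 0.85) from July on."""
    return (4, 0.75) if month < 7 else (3, 0.85)


def activity(score_by_year, this_year, month):
    """Time-decayed sum of yearly paper-group scores.

    `score_by_year` maps a year to the score of the papers published that
    year (hypothetical input). Assumption: the window covers the N years
    ending with `this_year`.
    """
    n_years, alpha = activity_params(month)
    total = 0.0
    for year in range(this_year - n_years + 1, this_year + 1):
        weight = alpha ** (this_year - year)  # Weight(n) = alpha^(this year - n)
        total += weight * score_by_year.get(year, 0.0)
    return total
```

Because the weight shrinks geometrically with the age of a paper, papers outside the window do not contribute at all, and recent papers dominate the score.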

Diversity

       Generally, an expert's research may include several different research fields. Diversity is defined to reflect quantitatively how widely an expert's research is spread across fields. In particular, we first use the author-conference-topic model (Tang et al., 2008) to obtain the research fields of each expert. Then we automatically assign his or her papers to the topics and calculate the topic distribution from the assignment results. Given expert A, the research topic distribution is defined as:


       The author diversity is then defined as the entropy of this distribution:
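The entropy step can be sketched as below. The assumptions are that each paper is assigned to exactly one topic, that the topic distribution is the fraction of the expert's papers in each topic, and that the natural logarithm is used; the text does not fix any of these, and the function names are illustrative.

```python
import math
from collections import Counter


def topic_distribution(paper_topics):
    """Fraction of an expert's papers assigned to each topic.

    `paper_topics` holds one topic label per paper (assumed hard assignment).
    """
    counts = Counter(paper_topics)
    total = sum(counts.values())
    return {topic: count / total for topic, count in counts.items()}


def diversity(paper_topics):
    """Shannon entropy (natural log) of the expert's topic distribution."""
    dist = topic_distribution(paper_topics)
    return -sum(p * math.log(p) for p in dist.values())
```

An expert whose papers all fall in one topic gets diversity 0, and the score grows as the papers spread more evenly over more topics.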


Sociability

       The sociability score is basically defined by how many coauthors an expert has. We define the score as:


       where #copaper_c denotes the number of papers coauthored by the expert and the coauthor c. In the next step, we will further consider location, organization, nationality, and research fields.
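The quantity #copaper_c can be computed as sketched below. The sketch only derives the inputs named in the text (the set of coauthors and the number of joint papers with each) and does not attempt to reproduce the scoring formula itself; the function name and the input format, a list of author-name lists, are assumptions.

```python
from collections import Counter


def copaper_counts(expert, papers):
    """Map each coauthor c of `expert` to #copaper_c.

    `papers` is an iterable of author-name lists, one list per paper
    (hypothetical input format).
    """
    counts = Counter()
    for authors in papers:
        if expert in authors:
            # Count every other author on this paper once.
            counts.update(set(authors) - {expert})
    return counts
```

The number of keys in the result is the expert's number of coauthors, and each value is the number of papers shared with that coauthor.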


Uptrend  

       Nothing catches people's eyes more than a rising star. We use uptrend to define the rising degree of a researcher. The information about each of an author's papers includes the publication date and the impact factor of the conference. We use the least squares method to fit a curve to the papers published in the recent N years. We then use the curve to predict the author's score in the next year, which is defined as the Uptrend score. Formally:






In this definition, N = 3. The current year is not included if we calculate the value in the first half of the year, and is included if we calculate the value in the second half of the year.

       For each of the 200 topics, we will calculate an author's uptrend in that topic.
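The fit-and-extrapolate step can be sketched as follows for a single topic. Several choices here are assumptions: the fitted curve is taken to be a straight line, each year is summarised by one score (for instance the summed impact factors of that year's papers), and the predicted "next year" is the year right after the fitted window. The window length N = 3 and the rule for including the current year come from the text.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def uptrend(score_by_year, this_year, month, n_years=3):
    """Predicted score for the year following the fitted window.

    `score_by_year` maps a year to that year's paper score (hypothetical
    input). The current year is used only in the second half of the year.
    """
    last = this_year if month >= 7 else this_year - 1
    years = list(range(last - n_years + 1, last + 1))
    scores = [score_by_year.get(year, 0.0) for year in years]
    slope, intercept = fit_line(years, scores)
    return slope * (last + 1) + intercept
```

With yearly scores of 1.0, 2.0 and 3.0 in the window, the fitted line rises by 1.0 per year and the predicted score is 4.0.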