« Several challenges confront economists wishing to take advantage of large, new data sets. These include gaining access to data, developing the data management and programming capabilities needed to work with large-scale data sets, and finally (and most importantly!) thinking of creative approaches to summarize, describe, and analyze the information contained in these data. » Einav, L. and Levin, J. - The Data Revolution and Economic Analysis, 2013
The main objective of this research is to build unbiased indices of European hedge fund performance. Reporting monthly performance is not mandatory in the hedge fund industry. As a consequence, managers can decide when and where (to which database or databases) they report. This choice is clearly not exogenous. For example, new funds often decide to report only once they have experienced a year (or sometimes more) of consecutive positive returns. The steps of our research agenda are listed below:
1. Analysing the hedge funds in each existing database (which funds, of what type, reporting to several databases or to just one, …) to get a clear idea of the composition of each database
2. Analysing the reasons for their reporting choice
3. Studying the impact of this choice on the return performance of each strategy in each database, and how the databases differ
4. Proposing and computing an index representative of the asset class performance in Europe
This research agenda first requires constructing a database of European fund managers.
Understanding the behaviour of fund managers (hedge funds, UCITS, mutual funds) is key to explaining the recent evolution of some markets, and also to detecting systemic risk factors (regulatory implications) and potential conflicts of interest (implications for financial markets law). Any analysis of this kind requires collecting information from several sources and accessing big-data-type databases.
The mainstream of research on hedge funds focuses on funds managed by US companies (and regulated by the SEC). The heterogeneity and lack of ‘good-quality’ data explain the apparent lack of interest in hedge funds managed by European asset managers (and regulated by ESMA and some local regulators). Moreover, constructing such a database is complicated and expensive. The first step consists in listing and, where necessary, buying all existing sources of data: commercial databases, data from the European regulators, and data to be collected from specialized websites. This last type of data collection is already organized in the United States, where teams of research assistants are hired to do the job. In a second step, this large amount of heterogeneous data should be aggregated across sources using big data methods and techniques. These treatments are very similar to those applied to medical records. Medical researchers mine vast amounts of data to discern new patterns in public health records and patients’ health histories. We want to mine vast amounts of data to discern new patterns in fund managers’ behaviour and strategies that affect price movements and contagion between markets.
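As a rough illustration of the aggregation step, the sketch below links fund records across databases by a normalized fund name, which also shows which funds report to several databases. The project text does not specify a linkage method; the suffix list, the function names and the sample records are all hypothetical, and real linkage would likely need fuzzy matching and identifiers such as ISINs.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form words to ignore; a real list would be longer.
LEGAL_SUFFIXES = {"ltd", "limited", "plc", "sa", "sicav", "lp", "llc", "fund"}


def normalize_name(name: str) -> str:
    """Lower-case a fund name, drop punctuation and common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def link_records(records):
    """Group (database, fund_name) pairs by normalized fund name.

    Returns a dict mapping each normalized name to the sorted list of
    databases in which that fund appears.
    """
    linked = defaultdict(set)
    for database, fund_name in records:
        linked[normalize_name(fund_name)].add(database)
    return {key: sorted(dbs) for key, dbs in linked.items()}


# Made-up sample records from two hypothetical databases.
sample = [
    ("DB_A", "Alpha Capital Fund Ltd."),
    ("DB_B", "ALPHA CAPITAL FUND LIMITED"),
    ("DB_A", "Beta Macro SICAV"),
]
linked_sample = link_records(sample)
```

Exact matching on a normalized key is deliberately conservative: it avoids false links at the cost of missing funds whose names are spelled differently across sources.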
Academic disciplines differ, along many dimensions, in the extent to which they are populated mostly by men or mostly by women. Building upon this well-established fact, this research examines the relationship between the "knowledge hierarchy" and the "gender hierarchy" across a range of social sciences. We first assess the past and present strength of this connection. We then explore how gender shapes the knowledge hierarchy at both ends: by structuring input (e.g., funneling men and women into different, and differentially valued, scientific trajectories, both across and within disciplines); and by structuring output (e.g., the relative devaluation of women's versus men's research topics and methods, and the lesser citation of women's work, which reproduces gender hierarchies). To this end, we rely primarily on massive but curated bibliometric data to identify patterns and causal relationships.
Svitlana Galeshchuk et al. (Dauphine)
In this study, we focus on the institutional designs that would encourage people to provide high-quality content on social media and that could reduce negativity online. In particular, we examine how the verification of authentic identity on Twitter, based on user consensus, affects tweets’ content and the power of an average user to spread information.
Building on the knowledge and methodologies that we accumulated in collecting data from Twitter, we build a database from the streaming API, hashtag searches, and retrieved historical tweets. A machine learning method is used to evaluate users' influence scores. Semantic analysis is applied to investigate additional dimensions of information in the text content.
We mimic an experiment by creating a control group through matching. More specifically, we match grassroots users who were verified with users who were not, based on their tweeting behaviour prior to the initiation of the hashtags. We assume that, conditional on account activity and observed attributes that are considered predetermined, the assignment of verification between the treatment group and the matched group is as good as random.
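The matching step can be sketched as follows. This is a minimal illustration, assuming greedy 1:1 nearest-neighbour matching without replacement on pre-treatment features; the study does not state which matching algorithm it uses, and the feature names and sample values below are hypothetical.

```python
import math


def nearest_neighbor_match(treated, controls):
    """Match each treated unit to its closest unused control (greedy, 1:1).

    `treated` and `controls` map a user id to a tuple of pre-treatment
    features (hypothetical here: tweets per day, log of follower count).
    Returns a dict mapping each treated id to its matched control id.
    """
    available = dict(controls)
    matches = {}
    for t_id, t_features in treated.items():
        if not available:
            break  # no controls left to match
        # Pick the remaining control with the smallest Euclidean distance.
        c_id = min(available, key=lambda c: math.dist(t_features, available[c]))
        matches[t_id] = c_id
        del available[c_id]  # matching without replacement
    return matches


# Made-up users: "v*" were verified, "c*" were not.
verified = {"v1": (10.0, 3.0), "v2": (2.0, 1.0)}
non_verified = {"c1": (9.0, 3.2), "c2": (2.5, 1.1), "c3": (50.0, 6.0)}
matched = nearest_neighbor_match(verified, non_verified)
```

In practice the features would usually be standardized first, or replaced by an estimated propensity score, so that no single attribute dominates the distance.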
We find that a hashtag initiated by a verified user has, on average, 76% more tweets containing it on the first day. The average number of tweets in each phase is around 44% higher for hashtags initiated by verified users than for those initiated by non-verified users. By further studying users’ profiles, we find that the impact of verification is more likely to result from third-party certification and an unintentional endorsement effect than from information disclosure. Further, we argue that opening verification to all users would help to neutralise verification and to remove the endorsement effect, which may complicate the welfare effect.
Coordinator : Dianzhuo Zhu (Dauphine, DRM)
In recent years, ride sharing has grown rapidly from individual or community-based hitchhiking and government-led carpooling into large-scale, platform-based ride sharing. Among the different forms, platforms like Blablacar (France) and Gomore (Denmark) have specific features. As long-distance ride sharing platforms, they encourage ordinary drivers to offer their empty seats instead of hiring individuals as professional drivers. They allow drivers to set their own price per journey, unlike the surge pricing used by Uber, in which an algorithm automatically adapts the price level to regulate supply in real time. They are mature platforms with a large amount of trip information available online compared with several competitors. An earlier survey of Blablacar users by Shaheen et al. (2017) has shown that drivers have various motivations, including extrinsic ones such as money and intrinsic ones such as solidarity. The working paper of Farajallah et al. (2016) has looked at the pricing behavior of drivers, but no effort has been made to identify these motivations statistically through pricing behavior, or to categorize typical types of drivers. Previous research by the author on a short-distance, spontaneous ride sharing platform shows that the level of generosity may be related to the distance of the trip. This research aims to go further in identifying motivations using larger data sets and tries to answer some of the following questions: Are there "pro-social" drivers who always set prices at a low level and "money-oriented" drivers who try to maximize their profit by adapting their pricing behavior? If so, under which circumstances (age, gender, trip distance, geography, etc.)? Does the trip experience perceived by passengers differ in these cases?
The Equipex Data for Financial History (DFIH) develops an infrastructure to collect, align and share data on all the assets traded on the Paris stock market and on listed issuers over the 19th and 20th centuries. It records, twice per month, the spot, forward and option prices of all securities (French and foreign; public and private; shares and bonds), currencies and precious metals, along with the securities events needed to harmonize prices over time (dividends, coupons and ex-dates; number of listed securities; (reverse) splits; etc.). It also records information on issuers, such as boards, headquarters, balance sheets and income statements, changes in equity capital, governance rules... The two main historical sources are the official lists of the exchanges for market data and various kinds of yearbooks for data on issuers. DFIH aims to broaden its scope to include a larger variety of data. To achieve its goals, it contributes to the development of innovative technologies for data extraction and enrichment.
Coordinator : Julien Jourdan (M&O - DRM)
Organizational scandals—broadly defined as publicized organizational transgressions that run counter to established norms—are ubiquitous phenomena in modern society, with wide-reaching consequences. With only a narrow set of cases and outcomes studied so far, the literature provides critical yet limited insights into the topic of organizational scandals, highlighting the need for more data and empirical research. This project aims at developing critical skills and knowledge about using social media to accumulate data on past and unfolding corporate scandals, and gain a better understanding of their dynamics.
This project investigates the reactions triggered by job-creation and job-destruction announcements on social networks.
For some forty years, the pay gap between men and women has given rise to an abundant literature. Over time and across countries, studies have documented both unequal pay, all else being equal, and the difficulty women face in reaching the upper levels of the hierarchy (the glass ceiling). Gobillon et al. (2015) document a 16% wage gap in the French State civil service. Pay gaps reflect a so-called "structural" effect (employees have individual characteristics that are not necessarily distributed in the same way, and these characteristics affect the level of pay) and an effect that stems from the gendered treatment of women's and men's careers and pay. Drawing on data provided by the French Ministry of the Environment, Energy and the Sea (ministère de l'Environnement, de l'Energie et de la Mer), this project examines differences in the pay and careers of the ministry's employees. The originality and richness of the available data (4 years of pay slips for 46,000 individuals, plus 10 years of career summaries) will make it possible to adopt an interdisciplinary approach, with tools drawn from complex systems, data science and machine learning. The aim is to explain differences in treatment as well as differences in individual career choices. Because the project is carried out in collaboration with the ministry's department for equal rights between women and men and the fight against discrimination ("Egalité des droits entre les femmes et les hommes et de la lutte contre les discriminations"), the study has not only a knowledge objective but also a public policy objective, i.e. remedying any discrimination that is found.
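One standard way to separate the "structural" effect from the treatment effect is a two-fold Oaxaca-Blinder decomposition of the mean wage gap. The project description does not name this method, so it is given here only as an assumed illustration. The notation is ours: $\bar{w}_g$ is the mean (log) wage of group $g \in \{M, F\}$, $\bar{X}_g$ the vector of mean characteristics, and $\hat{\beta}_g$ the coefficients of a wage regression with intercept estimated separately for each group.

```latex
\bar{w}_M - \bar{w}_F
  = \underbrace{(\bar{X}_M - \bar{X}_F)^{\top}\hat{\beta}_M}_{\text{structural (composition) effect}}
  + \underbrace{\bar{X}_F^{\top}(\hat{\beta}_M - \hat{\beta}_F)}_{\text{unexplained (treatment) effect}}
```

The identity follows from $\bar{w}_g = \bar{X}_g^{\top}\hat{\beta}_g$, which holds for OLS with an intercept. Using $\hat{\beta}_M$ as the reference coefficients is a choice; other reference vectors give a different split between the two terms.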
This project aims at better understanding the contributions of various stakeholders to the making of the EU regulations that shape economic activities. The single market policy has been a major driver of the reshaping of regulations in all domains over the last 30 years. While tremendous progress has been made toward deeper economic integration, many industries remain regulated and organized on a national basis, and a continuous process of harmonization is under way. In addition, new challenges (e.g. decarbonization, digital transformation) and central EU policies (e.g. inclusion, regional development) continually call for additional regulatory initiatives.
The decision process managed in Brussels is central, since it leads to the enactment of European Directives that are then transposed into national legislation. European regulations can also be passed and become immediately enforceable as law in all member states. The text of a draft directive (or regulation) is prepared by the Commission after consultation with its own and national experts. The draft is presented to the Parliament and to the Council, which is composed of the relevant ministers of member governments, initially for evaluation and comment, then for approval or rejection. While being required to consult Parliament on legislative proposals, the Council is not bound by Parliament's position.
In addition, according to Article 11 of the Treaty on European Union, ‘the European Commission shall carry out broad consultations with parties concerned in order to ensure that the Union’s actions are coherent and transparent’. Over the years, this has led to the development of formal processes by which the Commission collects input and views from stakeholders about its policies, and these processes are now central to lawmaking in Brussels. Draft directives and regulations are now systematically preceded by such stakeholder consultations. In parallel, the European Union has been implementing a set of rules and tools to prevent corruption in the operation of its own institutions. As a result, consultation with stakeholders is increasingly based upon formal and transparent procedures.
Overall, citizens and researchers now benefit from a set of formal, consistent and comprehensive sources of bureaucratic and parliamentary data on the decision process carried out in Brussels to draw up EU legislation. Complementary sources are also available on the transposition of this legislation into each national legal system. Altogether these data can be relied upon to study the effective process of decision in Brussels, and in particular:
Not only can the specificities of the European law-making process be documented, but we also benefit from a set of original data to study in concrete terms how the various interest groups influence the production of regulations. Until recently, only US data were available to document these processes, and data-intensive research on firms’ non-market strategies and the political economy of regulation was based on US data only.
The project "Economy of digital platforms" is linked to the ANR project CAPLA "Workers on tap. The social impact of platform capitalism". It aims at analyzing the actors taking part in the development of digital platforms (Deliveroo, Etsy, La Belle assiette, Uber...) in order to understand the transformations of contemporary capitalism beyond the current debate, which is polarized between praise of the “sharing economy” and denunciation of “uberization” as a new form of exploitation. The goal is to examine the different profiles of the workers offering their services on digital platforms, and to identify their employment status and the diversity of their work activities, so as to understand how this new economy works. To that end, data from the digital platforms will be obtained through web scraping and analyzed with statistical methods. Etsy and La Belle assiette will be the first platforms investigated.
The research question that the project seeks to answer is the following: does a longitudinal analysis of the discourse on management practices presented as innovative on the web make it possible to uncover a succession of management paradigms and to characterize them?
The project pursues three main objectives.
Patent policy in many countries addresses the possible impediments by exempting some research from infringement concerns. These exemptions take different forms (statutory vs. case law) and apply to different types of organizations or research. For example, most European countries restrict the use of patents to block research for noncommercial purposes. In the United States, no such statutory exemption exists. However, the US and other countries allow generic drug firms to use patented materials to prepare applications for regulatory approval. There is little empirical work on the use and importance of these exemptions.
We investigate the potential effects of intellectual property and research exemptions on cumulative innovation in drug development. Cumulative innovation in this context may be finding new uses for existing treatments, or using existing treatments to establish the benefits of a new compound. In the absence of a license from the patent holder, use of a patented drug in experiments may constitute infringement. The cost of obtaining a license and the risk of litigation increase the costs of cumulative innovation and may impede follow-on work.
Patent protection at the drug level varies over time and across countries; policies such as patent terms, patent exhaustion and research exemptions also vary across time and countries. We exploit this variation to identify the effect of patents on cumulative innovation.